1 diff -urN linux-2.4.20/Documentation/sched-coding.txt linux-2.4.20-o1/Documentation/sched-coding.txt
2 --- linux-2.4.20/Documentation/sched-coding.txt Thu Jan 1 01:00:00 1970
3 +++ linux-2.4.20-o1/Documentation/sched-coding.txt Wed Mar 12 00:41:43 2003
5 + Reference for various scheduler-related methods in the O(1) scheduler
6 + Robert Love <rml@tech9.net>, MontaVista Software
9 +Note most of these methods are local to kernel/sched.c - this is by design.
10 +The scheduler is meant to be self-contained and abstracted away. This document
11 +is primarily for understanding the scheduler, not interfacing to it. Some of
12 +the discussed interfaces, however, are general process/scheduling methods.
13 +They are typically defined in include/linux/sched.h.
16 +Main Scheduling Methods
17 +-----------------------
19 +void load_balance(runqueue_t *this_rq, int idle)
20 + Attempts to pull tasks from one cpu to another to balance cpu usage,
21 + if needed. This method is called explicitly when the runqueues are
22 + imbalanced, or periodically by the timer tick. Prior to calling,
23 + the current runqueue must be locked and interrupts disabled (a usage sketch follows this section).
25 +void schedule()
26 + The main scheduling function. Upon return, the highest priority
27 + process will be active.
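As a sketch (not part of the patch itself), a caller that honors the locking
rules for load_balance() might look like this, assuming rq->lock is the
per-runqueue spinlock and this_rq() is the helper described below:

	runqueue_t *rq = this_rq();
	unsigned long flags;

	local_irq_save(flags);		/* interrupts must be disabled */
	spin_lock(&rq->lock);		/* current runqueue must be locked */
	load_balance(rq, 1);		/* nonzero 'idle': this CPU is idling */
	spin_unlock(&rq->lock);
	local_irq_restore(flags);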
30 +Locking
31 +-------
33 +Each runqueue has its own lock, rq->lock. When multiple runqueues need
34 +to be locked, locks must be acquired in order of ascending &runqueue address.
36 +A specific runqueue is locked via
38 + task_rq_lock(task_t *p, unsigned long *flags)
40 +which disables preemption, disables interrupts, and locks the runqueue p is
41 +running on. Likewise,
43 + task_rq_unlock(task_t *p, unsigned long *flags)
45 +unlocks the runqueue p is running on, restores interrupts to their previous
46 +state, and reenables preemption.
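For example, following the signatures above, the usual pattern is a sketch
like:

	unsigned long flags;

	task_rq_lock(p, &flags);
	/* p's runqueue is locked, interrupts are off, preemption is off */
	task_rq_unlock(p, &flags);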
50 + double_rq_lock(runqueue_t *rq1, runqueue_t *rq2)
54 + double_rq_unlock(runqueue_t *rq1, runqueue_t *rq2)
56 +safely lock and unlock, respectively, the two specified runqueues. They do
57 +not, however, disable and restore interrupts. Users are required to do so
58 +manually before and after calls.
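A minimal sketch of the double-lock pattern, with the caller handling
interrupts as required above (this_rq() and task_rq() are described under
General Methods below):

	runqueue_t *rq1 = this_rq(), *rq2 = task_rq(p);
	unsigned long flags;

	local_irq_save(flags);
	double_rq_lock(rq1, rq2);	/* acquires both in ascending &runqueue order */
	/* ... e.g. move a task from rq1 to rq2 ... */
	double_rq_unlock(rq1, rq2);
	local_irq_restore(flags);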
61 +Values
62 +------
64 +MAX_PRIO
65 + The maximum priority of the system, stored in the task as task->prio.
66 + Lower values mean higher priority. Normal (non-RT) priorities
67 + range from MAX_RT_PRIO to (MAX_PRIO - 1).
68 +MAX_RT_PRIO
69 + The maximum real-time priority of the system. Valid RT priorities
70 + range from 0 to (MAX_RT_PRIO - 1).
71 +MAX_USER_RT_PRIO
72 + The maximum real-time priority that is exported to user-space. Should
73 + always be equal to or less than MAX_RT_PRIO. Setting it lower allows
74 + kernel threads to have higher priorities than any user-space task.
75 +MIN_TIMESLICE
76 +MAX_TIMESLICE
77 + Respectively, the minimum and maximum timeslices (quanta) of a process.
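To make the ranges concrete, assuming the conventional O(1)-scheduler values
(MAX_RT_PRIO == MAX_USER_RT_PRIO == 100, MAX_PRIO == 140 -- check this
patch's headers for the actual numbers), the priority space and the nice
mapping look like:

	/*   0 ...  99   RT priorities, 0 is highest
	 * 100 ... 139   normal priorities: nice -20 -> 100, nice +19 -> 139
	 */
	#define NICE_TO_PRIO(nice)	(MAX_RT_PRIO + (nice) + 20)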
79 +Data
80 +----
82 +struct runqueue
83 + The main per-CPU runqueue data structure.
84 +struct task_struct
85 + The main per-process data structure.
88 +General Methods
89 +---------------
91 +cpu_rq(cpu)
92 + Returns the runqueue of the specified cpu.
93 +this_rq()
94 + Returns the runqueue of the current cpu.
95 +task_rq(task)
96 + Returns the runqueue which holds the specified task.
97 +cpu_curr(cpu)
98 + Returns the task currently running on the given cpu.
99 +rt_task(pid)
100 + Returns true if pid is real-time, false if not.
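A short sketch tying these helpers together (illustrative only):

	runqueue_t *rq = cpu_rq(cpu);	/* that cpu's runqueue */
	task_t *curr = cpu_curr(cpu);	/* what that cpu is running now */

	if (rt_task(curr)) {
		/* curr->prio is in the RT range 0 .. MAX_RT_PRIO-1 */
	}
	if (task_rq(p) == this_rq()) {
		/* p is queued on (or running on) this cpu */
	}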
103 +Process Control Methods
104 +-----------------------
106 +void set_user_nice(task_t *p, long nice)
107 + Sets the "nice" value of task p to the given value.
108 +int setscheduler(pid_t pid, int policy, struct sched_param *param)
109 + Sets the scheduling policy and parameters for the given pid.
110 +void set_cpus_allowed(task_t *p, unsigned long new_mask)
111 + Sets a given task's CPU affinity and migrates it to a proper cpu.
112 + Callers must have a valid reference to the task and ensure the
113 + task does not exit prematurely. No locks can be held during the call.
114 +set_task_state(tsk, state_value)
115 + Sets the given task's state to the given value.
116 +set_current_state(state_value)
117 + Sets the current task's state to the given value.
118 +void set_tsk_need_resched(struct task_struct *tsk)
119 + Sets need_resched in the given task.
120 +void clear_tsk_need_resched(struct task_struct *tsk)
121 + Clears need_resched in the given task.
122 +void set_need_resched()
123 + Sets need_resched in the current task.
124 +void clear_need_resched()
125 + Clears need_resched in the current task.
126 +need_resched()
127 + Returns true if need_resched is set in the current task, false
128 + otherwise.
129 +void yield()
130 + Place the current process at the end of the runqueue and call schedule.
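Putting a few of these together, a kernel thread that pins itself to CPU 0
and sleeps for a second might do (sketch only; error handling omitted):

	set_user_nice(current, -5);		/* mild priority boost */
	set_cpus_allowed(current, 1UL << 0);	/* affine to CPU 0 only */

	set_current_state(TASK_INTERRUPTIBLE);
	schedule_timeout(HZ);

	if (need_resched())
		yield();	/* let a higher-priority task run first */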
131 diff -urN linux-2.4.20/Documentation/sched-design.txt linux-2.4.20-o1/Documentation/sched-design.txt
132 --- linux-2.4.20/Documentation/sched-design.txt Thu Jan 1 01:00:00 1970
133 +++ linux-2.4.20-o1/Documentation/sched-design.txt Wed Mar 12 00:41:43 2003
135 + Goals, Design and Implementation of the
136 + new ultra-scalable O(1) scheduler
139 + This is an edited version of an email Ingo Molnar sent to
140 + lkml on 4 Jan 2002. It describes the goals, design, and
141 + implementation of Ingo's new ultra-scalable O(1) scheduler.
142 + Last Updated: 18 April 2002.
148 +The main goal of the new scheduler is to keep all the good things we know
149 +and love about the current Linux scheduler:
151 + - good interactive performance even during high load: if the user
152 + types or clicks then the system must react instantly and must execute
153 + the user tasks smoothly, even during considerable background load.
155 + - good scheduling/wakeup performance with 1-2 runnable processes.
157 + - fairness: no process should stay without any timeslice for any
158 + unreasonable amount of time. No process should get an unjustly high
159 + amount of CPU time.
161 + - priorities: less important tasks can be started with lower priority,
162 + more important tasks with higher priority.
164 + - SMP efficiency: no CPU should stay idle if there is work to do.
166 + - SMP affinity: processes which run on one CPU should stay affine to
167 + that CPU. Processes should not bounce between CPUs too frequently.
169 + - plus additional scheduler features: RT scheduling, CPU binding.
171 +and the goal is also to add a few new things:
173 + - fully O(1) scheduling. Are you tired of the recalculation loop
174 + blowing the L1 cache away every now and then? Do you think the goodness
175 + loop is taking a bit too long to finish if there are lots of runnable
176 + processes? This new scheduler takes no prisoners: wakeup(), schedule(),
177 + the timer interrupt are all O(1) algorithms. There is no recalculation
178 + loop. There is no goodness loop either.
180 + - 'perfect' SMP scalability. With the new scheduler there is no 'big'
181 + runqueue_lock anymore - it's all per-CPU runqueues and locks - two
182 + tasks on two separate CPUs can wake up, schedule and context-switch
183 + completely in parallel, without any interlocking. All
184 + scheduling-relevant data is structured for maximum scalability.
186 + - better SMP affinity. The old scheduler has a particular weakness that
187 + causes tasks to bounce randomly between CPUs when higher-priority or
188 + interactive tasks are present; this was observed and reported by many
189 + people. The reason is that the timeslice recalculation loop first needs
190 + every currently running task to consume its timeslice. But when this
191 + happens on e.g. an 8-way system, this property starves an
192 + increasing number of CPUs from executing any process. Once the last
193 + task that has a timeslice left has finished using up that timeslice,
194 + the recalculation loop is triggered and other CPUs can start executing
195 + tasks again - after having idled around for a number of timer ticks.
196 + The more CPUs, the worse this effect.
198 + Furthermore, this same effect causes the bouncing effect as well:
199 + whenever there is such a 'timeslice squeeze' of the global runqueue,
200 + idle processors start executing tasks which are not affine to that CPU.
201 + (because the affine tasks have finished off their timeslices already.)
203 + The new scheduler solves this problem by distributing timeslices on a
204 + per-CPU basis, without having any global synchronization or
205 + recalculation.
207 + - batch scheduling. A significant proportion of computing-intensive tasks
208 + benefit from batch-scheduling, where timeslices are long and processes
209 + are roundrobin scheduled. The new scheduler does such batch-scheduling
210 + of the lowest priority tasks - so nice +19 jobs will get
211 + 'batch-scheduled' automatically. With this scheduler, nice +19 jobs are
212 + in essence SCHED_IDLE, from an interactiveness point of view.
214 + - handle extreme loads more smoothly, without breakdown and scheduling
215 + storms.
217 + - O(1) RT scheduling. For those RT folks who are paranoid about the
218 + O(nr_running) property of the goodness loop and the recalculation loop.
220 + - run fork()ed children before the parent. Andrea pointed out the
221 + advantages of this a few months ago, but patches for this feature
222 + do not work with the old scheduler as well as they should,
223 + because idle processes often steal the new child before the fork()ing
224 + CPU gets to execute it.
230 +the core of the new scheduler is built around the following mechanisms:
232 + - *two*, priority-ordered 'priority arrays' per CPU. There is an 'active'
233 + array and an 'expired' array. The active array contains all tasks that
234 + are affine to this CPU and have timeslices left. The expired array
235 + contains all tasks which have used up their timeslices - but this array
236 + is kept sorted as well. The active and expired arrays are not accessed
237 + directly; they are accessed through two pointers in the per-CPU runqueue
238 + structure. If all active tasks are used up then we 'switch' the two
239 + pointers and from now on the ready-to-go (former-) expired array is the
240 + active array - and the empty active array serves as the new collector
241 + for expired tasks.
243 + - there is a 64-bit bitmap cache for array indices. Finding the highest
244 + priority task is thus a matter of two x86 BSFL bit-search instructions.
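in code terms, each priority array is roughly the following shape (a sketch,
not necessarily the patch's exact declaration):

	struct prio_array {
		int		nr_active;
		unsigned long	bitmap[BITMAP_SIZE];	/* 1 bit per priority */
		struct list_head queue[MAX_PRIO];	/* 1 list per priority */
	};

	/* picking the next task to run is then just: */
	idx = sched_find_first_bit(array->bitmap);
	next = list_entry(array->queue[idx].next, task_t, run_list);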
246 +the split-array solution enables us to have an arbitrary number of active
247 +and expired tasks, and the recalculation of timeslices can be done
248 +immediately when the timeslice expires. Because the arrays are always
249 +accessed through the pointers in the runqueue, switching the two arrays can
250 +be done very quickly.
252 +this is a hybrid priority-list approach coupled with roundrobin
253 +scheduling and the array-switch method of distributing timeslices.
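the array switch itself is just a pointer swap done under the runqueue lock,
along these lines (sketch):

	if (unlikely(!rq->active->nr_active)) {
		prio_array_t *array = rq->active;

		rq->active = rq->expired;	/* expired becomes active */
		rq->expired = array;		/* empty array collects expired tasks */
	}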
255 + - there is a per-task 'load estimator'.
257 +one of the toughest things to get right is good interactive feel during
258 +heavy system load. While playing with various scheduler variants i found
259 +that the best interactive feel is achieved not by 'boosting' interactive
260 +tasks, but by 'punishing' tasks that want to use more CPU time than there
261 +is available. This method is also much easier to do in an O(1) fashion.
263 +to establish the actual 'load' the task contributes to the system, a
264 +complex-looking but pretty accurate method is used: there is a 4-entry
265 +'history' ringbuffer of the task's activities during the last 4 seconds.
266 +This ringbuffer is operated without much overhead. The entries tell the
267 +scheduler a pretty accurate load-history of the task: has it used up more
268 +CPU time or less during the past N seconds. [the size '4' and the interval
269 +of 4x 1 seconds were found by lots of experimentation - this part is
270 +flexible and can be changed in both directions.]
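a direct way to implement the estimator described above would be (sketch;
the field names are illustrative and the patch's actual bookkeeping may
differ):

	#define HISTORY_SLOTS	4		/* 4 x 1 second of history */

	unsigned long	run_ticks[HISTORY_SLOTS];	/* per-task bookkeeping */
	unsigned int	slot;

	/* once per second, from the timer tick: */
	slot = (slot + 1) % HISTORY_SLOTS;
	run_ticks[slot] = 0;

	/* on every tick the task actually runs: */
	run_ticks[slot]++;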
272 +the penalty a task gets for generating more load than the CPU can handle
273 +is a priority decrease - there is a maximum amount to this penalty
274 +relative to the task's static priority, so even fully CPU-bound tasks will
275 +observe each other's priorities, and will share the CPU accordingly.
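in effect the dynamic priority is the static priority plus a bounded bonus
or penalty, along the lines of (a sketch of the clamping; see kernel/sched.c
in this patch for the real code):

	int prio = p->static_prio - bonus;	/* negative bonus == penalty */

	if (prio < MAX_RT_PRIO)
		prio = MAX_RT_PRIO;		/* never enter the RT range */
	if (prio > MAX_PRIO - 1)
		prio = MAX_PRIO - 1;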
277 +the SMP load-balancer can be extended/switched with additional parallel
278 +computing and cache hierarchy concepts: NUMA scheduling, multi-core CPUs
279 +can be supported easily by changing the load-balancer. Right now it's
280 +tuned for my SMP systems.
282 +i skipped the prev->mm == next->mm advantage - no workload i know of shows
283 +any sensitivity to this. It can be added back by sacrificing O(1)
284 +schedule() [the current and one-lower priority list can be searched for a
285 +that->mm == current->mm condition], but it costs a fair number of cycles
286 +during a number of important workloads, so i wanted to avoid this as much
287 +as possible.
289 +- the SMP idle-task startup code was still racy and the new scheduler
290 +triggered this. So i streamlined the idle-setup code a bit. We do not call
291 +into schedule() before all processors have started up fully and all idle
292 +threads are in place.
294 +- the patch also cleans up a number of aspects of sched.c - moves code
295 +into other areas of the kernel where it's appropriate, and simplifies
296 +certain code paths and data constructs. As a result, the new scheduler's
297 +code is smaller than the old one.
300 diff -urN linux-2.4.20/arch/alpha/kernel/entry.S linux-2.4.20-o1/arch/alpha/kernel/entry.S
301 --- linux-2.4.20/arch/alpha/kernel/entry.S Sat Aug 3 02:39:42 2002
302 +++ linux-2.4.20-o1/arch/alpha/kernel/entry.S Wed Mar 12 00:41:43 2003
305 lda $26,ret_from_sys_call
308 jsr $31,schedule_tail
313 diff -urN linux-2.4.20/arch/alpha/kernel/process.c linux-2.4.20-o1/arch/alpha/kernel/process.c
314 --- linux-2.4.20/arch/alpha/kernel/process.c Sun Sep 30 21:26:08 2001
315 +++ linux-2.4.20-o1/arch/alpha/kernel/process.c Wed Mar 12 00:41:43 2003
319 /* An endless idle loop with no priority at all. */
320 - current->nice = 20;
321 - current->counter = -100;
324 /* FIXME -- EV6 and LCA45 know how to power down
326 diff -urN linux-2.4.20/arch/alpha/kernel/smp.c linux-2.4.20-o1/arch/alpha/kernel/smp.c
327 --- linux-2.4.20/arch/alpha/kernel/smp.c Sat Aug 3 02:39:42 2002
328 +++ linux-2.4.20-o1/arch/alpha/kernel/smp.c Wed Mar 12 00:41:43 2003
330 int smp_num_cpus = 1; /* Number that came online. */
331 int smp_threads_ready; /* True once the per process idle is forked. */
332 cycles_t cacheflush_time;
333 +unsigned long cache_decay_ticks;
335 int __cpu_number_map[NR_CPUS];
336 int __cpu_logical_map[NR_CPUS];
339 int cpuid = hard_smp_processor_id();
341 - if (current != init_tasks[cpu_number_map(cpuid)]) {
342 - printk("BUG: smp_calling: cpu %d current %p init_tasks[cpu_number_map(cpuid)] %p\n",
343 - cpuid, current, init_tasks[cpu_number_map(cpuid)]);
346 DBGS(("CALLIN %d state 0x%lx\n", cpuid, current->state));
348 /* Turn on machine checks. */
350 DBGS(("smp_callin: commencing CPU %d current %p\n",
353 - /* Setup the scheduler for this processor. */
356 /* ??? This should be in init_idle. */
357 atomic_inc(&init_mm.mm_count);
358 current->active_mm = &init_mm;
360 smp_tune_scheduling (int cpuid)
362 struct percpu_struct *cpu;
363 - unsigned long on_chip_cache;
364 - unsigned long freq;
365 + unsigned long on_chip_cache; /* kB */
366 + unsigned long freq; /* Hz */
367 + unsigned long bandwidth = 350; /* MB/s */
369 cpu = (struct percpu_struct*)((char*)hwrpb + hwrpb->processor_offset
370 + cpuid * hwrpb->processor_size);
371 @@ -258,29 +252,21 @@
375 - on_chip_cache = 64 + 64;
379 - on_chip_cache = 8 + 8;
380 + on_chip_cache = 64 + 64;
384 freq = hwrpb->cycle_freq ? : est_cycle_freq;
387 - /* Magic estimation stolen from x86 port. */
388 - cacheflush_time = freq / 1024L * on_chip_cache / 5000L;
390 - printk("Using heuristic of %d cycles.\n",
393 - /* Magic value to force potential preemption of other CPUs. */
394 - cacheflush_time = INT_MAX;
395 + cacheflush_time = (freq / 1000000) * (on_chip_cache << 10) / bandwidth;
396 + cache_decay_ticks = cacheflush_time / (freq / 1000) * HZ / 1000;
398 - printk("Using heuristic of %d cycles.\n",
401 + printk("per-CPU timeslice cutoff: %ld.%02ld usecs.\n",
402 + cacheflush_time/(freq/1000000),
403 + (cacheflush_time*100/(freq/1000000)) % 100);
404 + printk("task migration cache decay timeout: %ld msecs.\n",
405 + (cache_decay_ticks + 1) * 1000 / HZ);
409 @@ -505,14 +491,11 @@
410 if (idle == &init_task)
411 panic("idle process is init_task for CPU %d", cpuid);
413 - idle->processor = cpuid;
414 - idle->cpus_runnable = 1 << cpuid; /* we schedule the first task manually */
415 + init_idle(idle, cpuid);
416 + unhash_process(idle);
418 __cpu_logical_map[cpunum] = cpuid;
419 __cpu_number_map[cpuid] = cpunum;
421 - del_from_runqueue(idle);
422 - unhash_process(idle);
423 - init_tasks[cpunum] = idle;
425 DBGS(("smp_boot_one_cpu: CPU %d state 0x%lx flags 0x%lx\n",
426 cpuid, idle->state, idle->flags));
427 @@ -619,13 +602,10 @@
429 __cpu_number_map[boot_cpuid] = 0;
430 __cpu_logical_map[0] = boot_cpuid;
431 - current->processor = boot_cpuid;
433 smp_store_cpu_info(boot_cpuid);
434 smp_tune_scheduling(boot_cpuid);
435 smp_setup_percpu_timer(boot_cpuid);
439 /* ??? This should be in init_idle. */
440 atomic_inc(&init_mm.mm_count);
441 diff -urN linux-2.4.20/arch/arm/kernel/process.c linux-2.4.20-o1/arch/arm/kernel/process.c
442 --- linux-2.4.20/arch/arm/kernel/process.c Sat Aug 3 02:39:42 2002
443 +++ linux-2.4.20-o1/arch/arm/kernel/process.c Wed Mar 12 00:41:43 2003
446 /* endless idle loop with no priority at all */
448 - current->nice = 20;
449 - current->counter = -100;
452 void (*idle)(void) = pm_idle;
453 diff -urN linux-2.4.20/arch/cris/kernel/process.c linux-2.4.20-o1/arch/cris/kernel/process.c
454 --- linux-2.4.20/arch/cris/kernel/process.c Mon Feb 25 20:37:52 2002
455 +++ linux-2.4.20-o1/arch/cris/kernel/process.c Wed Mar 12 00:41:43 2003
456 @@ -124,10 +124,10 @@
458 int cpu_idle(void *unused)
461 - current->counter = -100;
469 /* if the watchdog is enabled, we can simply disable interrupts and go
470 diff -urN linux-2.4.20/arch/i386/kernel/apm.c linux-2.4.20-o1/arch/i386/kernel/apm.c
471 --- linux-2.4.20/arch/i386/kernel/apm.c Fri Nov 29 00:53:09 2002
472 +++ linux-2.4.20-o1/arch/i386/kernel/apm.c Wed Mar 12 00:41:43 2003
475 * [no-]debug log some debugging messages
476 * [no-]power[-_]off power off on shutdown
477 + * [no-]smp Use apm even on an SMP box
478 * bounce[-_]interval=<n> number of ticks to ignore suspend
480 * idle[-_]threshold=<n> System idle percentage above which to
482 static int got_clock_diff;
486 static int apm_disabled = -1;
488 static int power_off;
493 + * Lock APM functionality to physical CPU 0
498 +static unsigned long apm_save_cpus(void)
500 + unsigned long x = current->cpus_allowed;
501 + /* Some bioses don't like being called from CPU != 0 */
502 + if (cpu_number_map(smp_processor_id()) != 0) {
503 + set_cpus_allowed(current, 1 << cpu_logical_map(0));
504 + if (unlikely(cpu_number_map(smp_processor_id()) != 0))
510 +static inline void apm_restore_cpus(unsigned long mask)
512 + set_cpus_allowed(current, mask);
518 + * No CPU lockdown needed on a uniprocessor
521 +#define apm_save_cpus() 0
522 +#define apm_restore_cpus(x) (void)(x)
527 * These are the actual BIOS calls. Depending on APM_ZERO_SEGS and
528 * apm_info.allow_ints, we are being really paranoid here! Not only
529 * are interrupts disabled, but all the segment registers (except SS)
535 + unsigned long cpus = apm_save_cpus();
543 __restore_flags(flags);
545 + apm_restore_cpus(cpus);
554 + unsigned long cpus = apm_save_cpus();
562 __restore_flags(flags);
564 + apm_restore_cpus(cpus);
569 @@ -751,10 +796,12 @@
570 if (apm_bios_call_simple(APM_FUNC_IDLE, 0, 0, &eax)) {
571 static unsigned long t;
573 - if (time_after(jiffies, t + 10 * HZ)) {
574 + /* This always fails on some SMP boards running UP kernels.
575 + * Only report the failure the first 5 times.
578 printk(KERN_DEBUG "apm_do_idle failed (%d)\n",
584 @@ -888,17 +935,12 @@
586 * This may be called on an SMP machine.
589 - /* Some bioses don't like being called from CPU != 0 */
590 - if (cpu_number_map(smp_processor_id()) != 0) {
591 - current->cpus_allowed = 1;
593 - if (unlikely(cpu_number_map(smp_processor_id()) != 0))
597 if (apm_info.realmode_power_off)
599 + (void)apm_save_cpus();
600 machine_real_restart(po_bios_call, sizeof(po_bios_call));
601 + /* Never returns */
604 (void) set_system_power_state(APM_STATE_OFF);
606 @@ -1074,6 +1116,19 @@
608 if ((error == APM_SUCCESS) || (error == APM_NO_ERROR))
610 + if (error == APM_NOT_ENGAGED) {
613 + if (tried++ == 0) {
614 + eng_error = apm_engage_power_management(APM_DEVICE_ALL, 1);
616 + apm_error("set display", error);
617 + apm_error("engage interface", eng_error);
620 + return apm_console_blank(blank);
623 apm_error("set display", error);
626 @@ -1571,7 +1626,7 @@
630 - if ((smp_num_cpus == 1) &&
631 + if ((smp_num_cpus == 1 || smp) &&
632 !(error = apm_get_power_status(&bx, &cx, &dx))) {
633 ac_line_status = (bx >> 8) & 0xff;
634 battery_status = bx & 0xff;
635 @@ -1716,7 +1771,7 @@
640 + if (debug && (smp_num_cpus == 1 || smp )) {
641 error = apm_get_power_status(&bx, &cx, &dx);
643 printk(KERN_INFO "apm: power status not available\n");
644 @@ -1760,7 +1815,7 @@
645 pm_power_off = apm_power_off;
646 register_sysrq_key('o', &sysrq_poweroff_op);
648 - if (smp_num_cpus == 1) {
649 + if (smp_num_cpus == 1 || smp) {
650 #if defined(CONFIG_APM_DISPLAY_BLANK) && defined(CONFIG_VT)
651 console_blank_hook = apm_console_blank;
653 @@ -1799,6 +1854,11 @@
655 if (strncmp(str, "debug", 5) == 0)
657 + if (strncmp(str, "smp", 3) == 0)
660 + idle_threshold = 100;
662 if ((strncmp(str, "power-off", 9) == 0) ||
663 (strncmp(str, "power_off", 9) == 0))
665 @@ -1903,7 +1963,7 @@
666 printk(KERN_NOTICE "apm: disabled on user request.\n");
669 - if ((smp_num_cpus > 1) && !power_off) {
670 + if ((smp_num_cpus > 1) && !power_off && !smp) {
671 printk(KERN_NOTICE "apm: disabled - APM is not SMP safe.\n");
674 @@ -1957,7 +2017,7 @@
676 kernel_thread(apm, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD);
678 - if (smp_num_cpus > 1) {
679 + if (smp_num_cpus > 1 && !smp) {
681 "apm: disabled - APM is not SMP safe (power off active).\n");
683 @@ -2025,5 +2085,8 @@
684 MODULE_PARM(idle_period, "i");
685 MODULE_PARM_DESC(idle_period,
686 "Period (in sec/100) over which to caculate the idle percentage");
687 +MODULE_PARM(smp, "i");
688 +MODULE_PARM_DESC(smp,
689 + "Set this to enable APM use on an SMP platform. Use with caution on older systems");
692 diff -urN linux-2.4.20/arch/i386/kernel/entry.S linux-2.4.20-o1/arch/i386/kernel/entry.S
693 --- linux-2.4.20/arch/i386/kernel/entry.S Fri Nov 29 00:53:09 2002
694 +++ linux-2.4.20-o1/arch/i386/kernel/entry.S Wed Mar 12 00:41:43 2003
710 call SYMBOL_NAME(schedule_tail)
714 testb $0x02,tsk_ptrace(%ebx) # PT_TRACESYS
716 diff -urN linux-2.4.20/arch/i386/kernel/process.c linux-2.4.20-o1/arch/i386/kernel/process.c
717 --- linux-2.4.20/arch/i386/kernel/process.c Sat Aug 3 02:39:42 2002
718 +++ linux-2.4.20-o1/arch/i386/kernel/process.c Wed Mar 12 00:41:43 2003
721 if (current_cpu_data.hlt_works_ok && !hlt_counter) {
723 - if (!current->need_resched)
724 + if (!need_resched())
731 /* endless idle loop with no priority at all */
733 - current->nice = 20;
734 - current->counter = -100;
737 void (*idle)(void) = pm_idle;
738 @@ -697,15 +694,17 @@
739 asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs));
742 - * Restore %fs and %gs.
743 + * Restore %fs and %gs if needed.
745 - loadsegment(fs, next->fs);
746 - loadsegment(gs, next->gs);
747 + if (unlikely(prev->fs | prev->gs | next->fs | next->gs)) {
748 + loadsegment(fs, next->fs);
749 + loadsegment(gs, next->gs);
753 * Now maybe reload the debug registers
755 - if (next->debugreg[7]){
756 + if (unlikely(next->debugreg[7])) {
764 - if (prev->ioperm || next->ioperm) {
765 + if (unlikely(prev->ioperm || next->ioperm)) {
768 * 4 cachelines copy ... not good, but not that
769 diff -urN linux-2.4.20/arch/i386/kernel/setup.c linux-2.4.20-o1/arch/i386/kernel/setup.c
770 --- linux-2.4.20/arch/i386/kernel/setup.c Fri Nov 29 00:53:09 2002
771 +++ linux-2.4.20-o1/arch/i386/kernel/setup.c Wed Mar 12 00:41:43 2003
772 @@ -3046,9 +3046,10 @@
777 - * Clear all 6 debug registers:
779 + /* Clear %fs and %gs. */
780 + asm volatile ("xorl %eax, %eax; movl %eax, %fs; movl %eax, %gs");
782 + /* Clear all 6 debug registers: */
784 #define CD(register) __asm__("movl %0,%%db" #register ::"r"(0) );
786 diff -urN linux-2.4.20/arch/i386/kernel/smp.c linux-2.4.20-o1/arch/i386/kernel/smp.c
787 --- linux-2.4.20/arch/i386/kernel/smp.c Fri Nov 29 00:53:09 2002
788 +++ linux-2.4.20-o1/arch/i386/kernel/smp.c Wed Mar 12 00:41:43 2003
789 @@ -493,10 +493,20 @@
790 * it goes straight through and wastes no time serializing
791 * anything. Worst case is that we lose a reschedule ...
794 void smp_send_reschedule(int cpu)
796 send_IPI_mask(1 << cpu, RESCHEDULE_VECTOR);
800 + * this function sends a reschedule IPI to all (other) CPUs.
801 + * This should only be used if some 'global' task became runnable,
802 + * such as a RT task, that must be handled now. The first CPU
803 + * that manages to grab the task will run it.
805 +void smp_send_reschedule_all(void)
807 + send_IPI_allbutself(RESCHEDULE_VECTOR);
811 diff -urN linux-2.4.20/arch/i386/kernel/smpboot.c linux-2.4.20-o1/arch/i386/kernel/smpboot.c
812 --- linux-2.4.20/arch/i386/kernel/smpboot.c Fri Nov 29 00:53:09 2002
813 +++ linux-2.4.20-o1/arch/i386/kernel/smpboot.c Wed Mar 12 00:41:43 2003
814 @@ -308,14 +308,14 @@
815 if (tsc_values[i] < avg)
816 realdelta = -realdelta;
818 - printk("BIOS BUG: CPU#%d improperly initialized, has %ld usecs TSC skew! FIXED.\n",
820 + printk("BIOS BUG: CPU#%d improperly initialized, has %ld usecs TSC skew! FIXED.\n", i, realdelta);
830 static void __init synchronize_tsc_ap (void)
832 * (This works even if the APIC is not enabled.)
834 phys_id = GET_APIC_ID(apic_read(APIC_ID));
835 - cpuid = current->processor;
837 if (test_and_set_bit(cpuid, &cpu_online_map)) {
838 printk("huh, phys CPU#%d, CPU#%d already present??\n",
842 smp_store_cpu_info(cpuid);
844 + disable_APIC_timer();
846 * Allow the master to continue.
850 while (!atomic_read(&smp_commenced))
852 + enable_APIC_timer();
854 * low-memory mappings have been cleared, flush them from
855 * the local TLBs too.
856 @@ -803,16 +805,13 @@
858 panic("No idle process for CPU %d", cpu);
860 - idle->processor = cpu;
861 - idle->cpus_runnable = 1 << cpu; /* we schedule the first task manually */
862 + init_idle(idle, cpu);
864 map_cpu_to_boot_apicid(cpu, apicid);
866 idle->thread.eip = (unsigned long) start_secondary;
868 - del_from_runqueue(idle);
869 unhash_process(idle);
870 - init_tasks[cpu] = idle;
872 /* start_eip had better be page-aligned! */
873 start_eip = setup_trampoline();
877 cycles_t cacheflush_time;
878 +unsigned long cache_decay_ticks;
880 static void smp_tune_scheduling (void)
883 cacheflush_time = (cpu_khz>>10) * (cachesize<<10) / bandwidth;
886 + cache_decay_ticks = (long)cacheflush_time/cpu_khz * HZ / 1000;
888 printk("per-CPU timeslice cutoff: %ld.%02ld usecs.\n",
889 (long)cacheflush_time/(cpu_khz/1000),
890 ((long)cacheflush_time*100/(cpu_khz/1000)) % 100);
891 + printk("task migration cache decay timeout: %ld msecs.\n",
892 + (cache_decay_ticks + 1) * 1000 / HZ);
896 @@ -1023,8 +1027,7 @@
897 map_cpu_to_boot_apicid(0, boot_cpu_apicid);
899 global_irq_holder = 0;
900 - current->processor = 0;
903 smp_tune_scheduling();
906 diff -urN linux-2.4.20/arch/mips64/kernel/process.c linux-2.4.20-o1/arch/mips64/kernel/process.c
907 --- linux-2.4.20/arch/mips64/kernel/process.c Fri Nov 29 00:53:10 2002
908 +++ linux-2.4.20-o1/arch/mips64/kernel/process.c Wed Mar 12 00:41:43 2003
911 /* endless idle loop with no priority at all */
913 - current->nice = 20;
914 - current->counter = -100;
917 while (!current->need_resched)
919 diff -urN linux-2.4.20/arch/parisc/kernel/process.c linux-2.4.20-o1/arch/parisc/kernel/process.c
920 --- linux-2.4.20/arch/parisc/kernel/process.c Fri Nov 29 00:53:10 2002
921 +++ linux-2.4.20-o1/arch/parisc/kernel/process.c Wed Mar 12 00:41:43 2003
924 /* endless idle loop with no priority at all */
926 - current->nice = 20;
927 - current->counter = -100;
930 while (!current->need_resched) {
931 diff -urN linux-2.4.20/arch/ppc/8260_io/uart.c linux-2.4.20-o1/arch/ppc/8260_io/uart.c
932 --- linux-2.4.20/arch/ppc/8260_io/uart.c Sat Aug 3 02:39:43 2002
933 +++ linux-2.4.20-o1/arch/ppc/8260_io/uart.c Wed Mar 12 00:41:43 2003
934 @@ -1732,7 +1732,6 @@
935 printk("lsr = %d (jiff=%lu)...", lsr, jiffies);
937 current->state = TASK_INTERRUPTIBLE;
938 -/* current->counter = 0; make us low-priority */
939 schedule_timeout(char_time);
940 if (signal_pending(current))
942 diff -urN linux-2.4.20/arch/ppc/8xx_io/uart.c linux-2.4.20-o1/arch/ppc/8xx_io/uart.c
943 --- linux-2.4.20/arch/ppc/8xx_io/uart.c Sat Aug 3 02:39:43 2002
944 +++ linux-2.4.20-o1/arch/ppc/8xx_io/uart.c Wed Mar 12 00:41:43 2003
945 @@ -1796,7 +1796,6 @@
946 printk("lsr = %d (jiff=%lu)...", lsr, jiffies);
948 current->state = TASK_INTERRUPTIBLE;
949 -/* current->counter = 0; make us low-priority */
950 schedule_timeout(char_time);
951 if (signal_pending(current))
953 diff -urN linux-2.4.20/arch/ppc/kernel/entry.S linux-2.4.20-o1/arch/ppc/kernel/entry.S
954 --- linux-2.4.20/arch/ppc/kernel/entry.S Fri Nov 29 00:53:11 2002
955 +++ linux-2.4.20-o1/arch/ppc/kernel/entry.S Wed Mar 12 00:41:43 2003
963 lwz r0,TASK_PTRACE(r2)
964 andi. r0,r0,PT_TRACESYS
966 diff -urN linux-2.4.20/arch/ppc/kernel/idle.c linux-2.4.20-o1/arch/ppc/kernel/idle.c
967 --- linux-2.4.20/arch/ppc/kernel/idle.c Fri Nov 29 00:53:11 2002
968 +++ linux-2.4.20-o1/arch/ppc/kernel/idle.c Wed Mar 12 00:41:43 2003
972 /* endless loop with no priority at all */
973 - current->nice = 20;
974 - current->counter = -100;
979 if (!do_power_save) {
980 diff -urN linux-2.4.20/arch/ppc/kernel/mk_defs.c linux-2.4.20-o1/arch/ppc/kernel/mk_defs.c
981 --- linux-2.4.20/arch/ppc/kernel/mk_defs.c Tue Aug 28 15:58:33 2001
982 +++ linux-2.4.20-o1/arch/ppc/kernel/mk_defs.c Wed Mar 12 00:41:43 2003
984 /*DEFINE(KERNELBASE, KERNELBASE);*/
985 DEFINE(STATE, offsetof(struct task_struct, state));
986 DEFINE(NEXT_TASK, offsetof(struct task_struct, next_task));
987 - DEFINE(COUNTER, offsetof(struct task_struct, counter));
988 - DEFINE(PROCESSOR, offsetof(struct task_struct, processor));
989 + DEFINE(COUNTER, offsetof(struct task_struct, time_slice));
990 + DEFINE(PROCESSOR, offsetof(struct task_struct, cpu));
991 DEFINE(SIGPENDING, offsetof(struct task_struct, sigpending));
992 DEFINE(THREAD, offsetof(struct task_struct, thread));
993 DEFINE(MM, offsetof(struct task_struct, mm));
994 diff -urN linux-2.4.20/arch/ppc/kernel/process.c linux-2.4.20-o1/arch/ppc/kernel/process.c
995 --- linux-2.4.20/arch/ppc/kernel/process.c Mon Nov 26 14:29:17 2001
996 +++ linux-2.4.20-o1/arch/ppc/kernel/process.c Wed Mar 12 00:41:43 2003
1001 - printk(" CPU: %d", current->processor);
1002 + printk(" CPU: %d", current->cpu);
1003 #endif /* CONFIG_SMP */
1006 diff -urN linux-2.4.20/arch/ppc/kernel/smp.c linux-2.4.20-o1/arch/ppc/kernel/smp.c
1007 --- linux-2.4.20/arch/ppc/kernel/smp.c Sat Aug 3 02:39:43 2002
1008 +++ linux-2.4.20-o1/arch/ppc/kernel/smp.c Wed Mar 12 00:41:43 2003
1010 unsigned long cpu_online_map;
1011 int smp_hw_index[NR_CPUS];
1012 static struct smp_ops_t *smp_ops;
1013 +unsigned long cache_decay_ticks = HZ/100;
1015 /* all cpu mappings are 1-1 -- Cort */
1016 volatile unsigned long cpu_callin_map[NR_CPUS];
1018 * cpu 0, the master -- Cort
1020 cpu_callin_map[0] = 1;
1021 - current->processor = 0;
1026 for (i = 0; i < NR_CPUS; i++) {
1027 prof_counter[i] = 1;
1028 @@ -348,12 +347,9 @@
1029 p = init_task.prev_task;
1031 panic("No idle task for CPU %d", i);
1032 - del_from_runqueue(p);
1035 - init_tasks[i] = p;
1038 - p->cpus_runnable = 1 << i; /* we schedule the first task manually */
1044 void __init smp_callin(void)
1046 - int cpu = current->processor;
1047 + int cpu = current->cpu;
1049 smp_store_cpu_info(cpu);
1050 set_dec(tb_ticks_per_jiffy);
1051 diff -urN linux-2.4.20/arch/ppc/lib/dec_and_lock.c linux-2.4.20-o1/arch/ppc/lib/dec_and_lock.c
1052 --- linux-2.4.20/arch/ppc/lib/dec_and_lock.c Fri Nov 16 19:10:08 2001
1053 +++ linux-2.4.20-o1/arch/ppc/lib/dec_and_lock.c Wed Mar 12 00:41:43 2003
1055 #include <linux/module.h>
1056 +#include <linux/sched.h>
1057 #include <linux/spinlock.h>
1058 #include <asm/atomic.h>
1059 #include <asm/system.h>
1060 diff -urN linux-2.4.20/arch/ppc/mm/init.c linux-2.4.20-o1/arch/ppc/mm/init.c
1061 --- linux-2.4.20/arch/ppc/mm/init.c Sat Aug 3 02:39:43 2002
1062 +++ linux-2.4.20-o1/arch/ppc/mm/init.c Wed Mar 12 00:41:43 2003
1067 - printk("%3d ", p->processor);
1068 - if ( (p->processor != NO_PROC_ID) &&
1069 - (p == current_set[p->processor]) )
1070 + printk("%3d ", p->cpu);
1071 + if ( (p->cpu != NO_PROC_ID) &&
1072 + (p == current_set[p->cpu]) )
1076 diff -urN linux-2.4.20/arch/ppc64/kernel/entry.S linux-2.4.20-o1/arch/ppc64/kernel/entry.S
1077 --- linux-2.4.20/arch/ppc64/kernel/entry.S Fri Nov 29 00:53:11 2002
1078 +++ linux-2.4.20-o1/arch/ppc64/kernel/entry.S Wed Mar 12 00:41:43 2003
1082 _GLOBAL(ret_from_fork)
1086 ld r0,TASK_PTRACE(r13)
1087 andi. r0,r0,PT_TRACESYS
1088 beq+ .ret_from_except
1089 diff -urN linux-2.4.20/arch/ppc64/kernel/idle.c linux-2.4.20-o1/arch/ppc64/kernel/idle.c
1090 --- linux-2.4.20/arch/ppc64/kernel/idle.c Sat Aug 3 02:39:43 2002
1091 +++ linux-2.4.20-o1/arch/ppc64/kernel/idle.c Wed Mar 12 00:41:43 2003
1096 - /* endless loop with no priority at all */
1097 - current->nice = 20;
1098 - current->counter = -100;
1099 #ifdef CONFIG_PPC_ISERIES
1100 /* ensure iSeries run light will be out when idle */
1101 current->thread.flags &= ~PPC_FLAG_RUN_LIGHT;
1107 + /* endless loop with no priority at all */
1111 diff -urN linux-2.4.20/arch/ppc64/kernel/process.c linux-2.4.20-o1/arch/ppc64/kernel/process.c
1112 --- linux-2.4.20/arch/ppc64/kernel/process.c Fri Nov 29 00:53:11 2002
1113 +++ linux-2.4.20-o1/arch/ppc64/kernel/process.c Wed Mar 12 00:41:43 2003
1115 #ifdef SHOW_TASK_SWITCHES
1116 printk("%s/%d -> %s/%d NIP %08lx cpu %d root %x/%x\n",
1117 prev->comm,prev->pid,
1118 - new->comm,new->pid,new->thread.regs->nip,new->processor,
1119 + new->comm,new->pid,new->thread.regs->nip,new->cpu,
1120 new->fs->root,prev->fs->root);
1123 diff -urN linux-2.4.20/arch/ppc64/kernel/smp.c linux-2.4.20-o1/arch/ppc64/kernel/smp.c
1124 --- linux-2.4.20/arch/ppc64/kernel/smp.c Fri Nov 29 00:53:11 2002
1125 +++ linux-2.4.20-o1/arch/ppc64/kernel/smp.c Wed Mar 12 00:41:43 2003
1127 extern atomic_t ipi_sent;
1128 spinlock_t kernel_flag __cacheline_aligned = SPIN_LOCK_UNLOCKED;
1129 cycles_t cacheflush_time;
1130 +unsigned long cache_decay_ticks = HZ/100;
1131 static int max_cpus __initdata = NR_CPUS;
1133 unsigned long cpu_online_map;
1135 * cpu 0, the master -- Cort
1137 cpu_callin_map[0] = 1;
1138 - current->processor = 0;
1143 for (i = 0; i < NR_CPUS; i++) {
1144 paca[i].prof_counter = 1;
1145 @@ -684,12 +683,9 @@
1147 PPCDBG(PPCDBG_SMP,"\tProcessor %d, task = 0x%lx\n", i, p);
1149 - del_from_runqueue(p);
1152 - init_tasks[i] = p;
1155 - p->cpus_runnable = 1 << i; /* we schedule the first task manually */
1156 current_set[i].task = p;
1157 sp = ((unsigned long)p) + sizeof(union task_union)
1158 - STACK_FRAME_OVERHEAD;
1161 void __init smp_callin(void)
1163 - int cpu = current->processor;
1164 + int cpu = current->cpu;
1166 smp_store_cpu_info(cpu);
1167 set_dec(paca[cpu].default_decr);
1170 ppc_md.smp_setup_cpu(cpu);
1174 set_bit(smp_processor_id(), &cpu_online_map);
1176 while(!smp_commenced) {
1181 - cpu = current->processor;
1182 + cpu = current->cpu;
1183 atomic_inc(&init_mm.mm_count);
1184 current->active_mm = &init_mm;
1186 diff -urN linux-2.4.20/arch/s390/kernel/process.c linux-2.4.20-o1/arch/s390/kernel/process.c
1187 --- linux-2.4.20/arch/s390/kernel/process.c Sat Aug 3 02:39:43 2002
1188 +++ linux-2.4.20-o1/arch/s390/kernel/process.c Wed Mar 12 00:41:43 2003
1191 /* endless idle loop with no priority at all */
1193 - current->nice = 20;
1194 - current->counter = -100;
1197 if (current->need_resched) {
1199 diff -urN linux-2.4.20/arch/s390x/kernel/process.c linux-2.4.20-o1/arch/s390x/kernel/process.c
1200 --- linux-2.4.20/arch/s390x/kernel/process.c Fri Nov 29 00:53:11 2002
1201 +++ linux-2.4.20-o1/arch/s390x/kernel/process.c Wed Mar 12 00:41:43 2003
1204 /* endless idle loop with no priority at all */
1206 - current->nice = 20;
1207 - current->counter = -100;
1210 if (current->need_resched) {
1212 diff -urN linux-2.4.20/arch/sh/kernel/process.c linux-2.4.20-o1/arch/sh/kernel/process.c
1213 --- linux-2.4.20/arch/sh/kernel/process.c Mon Oct 15 22:36:48 2001
1214 +++ linux-2.4.20-o1/arch/sh/kernel/process.c Wed Mar 12 00:41:43 2003
1217 /* endless idle loop with no priority at all */
1219 - current->nice = 20;
1220 - current->counter = -100;
1224 diff -urN linux-2.4.20/arch/sparc/kernel/entry.S linux-2.4.20-o1/arch/sparc/kernel/entry.S
1225 --- linux-2.4.20/arch/sparc/kernel/entry.S Tue Nov 13 18:16:05 2001
1226 +++ linux-2.4.20-o1/arch/sparc/kernel/entry.S Wed Mar 12 00:46:06 2003
1227 @@ -1463,7 +1463,9 @@
1229 .globl C_LABEL(ret_from_fork)
1230 C_LABEL(ret_from_fork):
1235 b C_LABEL(ret_sys_call)
1236 ld [%sp + REGWIN_SZ + PT_I0], %o0
1237 diff -urN linux-2.4.20/arch/sparc/kernel/process.c linux-2.4.20-o1/arch/sparc/kernel/process.c
1238 --- linux-2.4.20/arch/sparc/kernel/process.c Sat Aug 3 02:39:43 2002
1239 +++ linux-2.4.20-o1/arch/sparc/kernel/process.c Wed Mar 12 00:41:43 2003
1243 /* endless idle loop with no priority at all */
1244 - current->nice = 20;
1245 - current->counter = -100;
1249 if (ARCH_SUN4C_SUN4) {
1253 /* endless idle loop with no priority at all */
1254 - current->nice = 20;
1255 - current->counter = -100;
1259 if(current->need_resched) {
1260 diff -urN linux-2.4.20/arch/sparc/kernel/smp.c linux-2.4.20-o1/arch/sparc/kernel/smp.c
1261 --- linux-2.4.20/arch/sparc/kernel/smp.c Fri Dec 21 18:41:53 2001
1262 +++ linux-2.4.20-o1/arch/sparc/kernel/smp.c Wed Mar 12 00:41:43 2003
1264 volatile int __cpu_number_map[NR_CPUS];
1265 volatile int __cpu_logical_map[NR_CPUS];
1266 cycles_t cacheflush_time = 0; /* XXX */
1267 +unsigned long cache_decay_ticks = HZ/100; /* XXX */
1269 /* The only guaranteed locking primitive available on all Sparc
1270 * processors is 'ldstub [%reg + immediate], %dest_reg' which atomically
1271 diff -urN linux-2.4.20/arch/sparc/kernel/sun4d_smp.c linux-2.4.20-o1/arch/sparc/kernel/sun4d_smp.c
1272 --- linux-2.4.20/arch/sparc/kernel/sun4d_smp.c Sat Aug 3 02:39:43 2002
1273 +++ linux-2.4.20-o1/arch/sparc/kernel/sun4d_smp.c Wed Mar 12 00:41:43 2003
1275 * the SMP initialization the master will be just allowed
1276 * to call the scheduler code.
1280 /* Get our local ticker going. */
1281 smp_setup_percpu_timer();
1283 while((unsigned long)current_set[cpuid] < PAGE_OFFSET)
1286 - while(current_set[cpuid]->processor != cpuid)
1287 + while(current_set[cpuid]->cpu != cpuid)
1290 /* Fix idle thread fields. */
1291 @@ -197,10 +196,8 @@
1293 __cpu_number_map[boot_cpu_id] = 0;
1294 __cpu_logical_map[0] = boot_cpu_id;
1295 - current->processor = boot_cpu_id;
1296 smp_store_cpu_info(boot_cpu_id);
1297 smp_setup_percpu_timer();
1299 local_flush_cache_all();
1300 if(linux_num_cpus == 1)
1301 return; /* Not an MP box. */
1302 @@ -222,14 +219,10 @@
1305 p = init_task.prev_task;
1306 - init_tasks[i] = p;
1309 - p->cpus_runnable = 1 << i; /* we schedule the first task manually */
1313 - del_from_runqueue(p);
1317 for (no = 0; no < linux_num_cpus; no++)
1318 diff -urN linux-2.4.20/arch/sparc/kernel/sun4m_smp.c linux-2.4.20-o1/arch/sparc/kernel/sun4m_smp.c
1319 --- linux-2.4.20/arch/sparc/kernel/sun4m_smp.c Wed Nov 21 19:31:09 2001
1320 +++ linux-2.4.20-o1/arch/sparc/kernel/sun4m_smp.c Wed Mar 12 00:41:43 2003
1322 * the SMP initialization the master will be just allowed
1323 * to call the scheduler code.
1327 /* Allow master to continue. */
1328 swap((unsigned long *)&cpu_callin_map[cpuid], 1);
1329 @@ -170,12 +169,10 @@
1330 mid_xlate[boot_cpu_id] = (linux_cpus[boot_cpu_id].mid & ~8);
1331 __cpu_number_map[boot_cpu_id] = 0;
1332 __cpu_logical_map[0] = boot_cpu_id;
1333 - current->processor = boot_cpu_id;
1335 smp_store_cpu_info(boot_cpu_id);
1336 set_irq_udt(mid_xlate[boot_cpu_id]);
1337 smp_setup_percpu_timer();
1339 local_flush_cache_all();
1340 if(linux_num_cpus == 1)
1341 return; /* Not an MP box. */
1342 @@ -195,14 +192,10 @@
1345 p = init_task.prev_task;
1346 - init_tasks[i] = p;
1349 - p->cpus_runnable = 1 << i; /* we schedule the first task manually */
1353 - del_from_runqueue(p);
1357 /* See trampoline.S for details... */
1358 diff -urN linux-2.4.20/arch/sparc64/kernel/entry.S linux-2.4.20-o1/arch/sparc64/kernel/entry.S
1359 --- linux-2.4.20/arch/sparc64/kernel/entry.S Fri Nov 29 00:53:12 2002
1360 +++ linux-2.4.20-o1/arch/sparc64/kernel/entry.S Wed Mar 12 00:46:53 2003
1361 @@ -1619,7 +1619,9 @@
1363 andn %o7, SPARC_FLAG_NEWCHILD, %l0
1364 mov %g5, %o0 /* 'prev' */
1368 stb %l0, [%g6 + AOFF_task_thread + AOFF_thread_flags]
1369 andcc %l0, SPARC_FLAG_PERFCTR, %g0
1371 diff -urN linux-2.4.20/arch/sparc64/kernel/irq.c linux-2.4.20-o1/arch/sparc64/kernel/irq.c
1372 --- linux-2.4.20/arch/sparc64/kernel/irq.c Fri Nov 29 00:53:12 2002
1373 +++ linux-2.4.20-o1/arch/sparc64/kernel/irq.c Wed Mar 12 00:41:43 2003
1375 tid = ((tid & UPA_CONFIG_MID) << 9);
1376 tid &= IMAP_TID_UPA;
1378 - tid = (starfire_translate(imap, current->processor) << 26);
1379 + tid = (starfire_translate(imap, current->cpu) << 26);
1380 tid &= IMAP_TID_UPA;
1383 diff -urN linux-2.4.20/arch/sparc64/kernel/process.c linux-2.4.20-o1/arch/sparc64/kernel/process.c
1384 --- linux-2.4.20/arch/sparc64/kernel/process.c Fri Nov 29 00:53:12 2002
1385 +++ linux-2.4.20-o1/arch/sparc64/kernel/process.c Wed Mar 12 00:41:43 2003
1389 /* endless idle loop with no priority at all */
1390 - current->nice = 20;
1391 - current->counter = -100;
1395 /* If current->need_resched is zero we should really
1398 * the idle loop on a UltraMultiPenguin...
1400 -#define idle_me_harder() (cpu_data[current->processor].idle_volume += 1)
1401 -#define unidle_me() (cpu_data[current->processor].idle_volume = 0)
1402 +#define idle_me_harder() (cpu_data[current->cpu].idle_volume += 1)
1403 +#define unidle_me() (cpu_data[current->cpu].idle_volume = 0)
1406 - current->nice = 20;
1407 - current->counter = -100;
1411 if (current->need_resched != 0) {
1413 diff -urN linux-2.4.20/arch/sparc64/kernel/smp.c linux-2.4.20-o1/arch/sparc64/kernel/smp.c
1414 --- linux-2.4.20/arch/sparc64/kernel/smp.c Fri Nov 29 00:53:12 2002
1415 +++ linux-2.4.20-o1/arch/sparc64/kernel/smp.c Wed Mar 12 00:41:43 2003
1417 printk("Entering UltraSMPenguin Mode...\n");
1419 smp_store_cpu_info(boot_cpu_id);
1422 if (linux_num_cpus == 1)
1424 @@ -282,12 +281,8 @@
1427 p = init_task.prev_task;
1428 - init_tasks[cpucount] = p;
1431 - p->cpus_runnable = 1UL << i; /* we schedule the first task manually */
1433 - del_from_runqueue(p);
1438 @@ -1154,8 +1149,113 @@
1439 __cpu_number_map[boot_cpu_id] = 0;
1440 prom_cpu_nodes[boot_cpu_id] = linux_cpus[0].prom_node;
1441 __cpu_logical_map[0] = boot_cpu_id;
1442 - current->processor = boot_cpu_id;
1443 prof_counter(boot_cpu_id) = prof_multiplier(boot_cpu_id) = 1;
1446 +cycles_t cacheflush_time;
1447 +unsigned long cache_decay_ticks;
1449 +extern unsigned long cheetah_tune_scheduling(void);
1451 +static void __init smp_tune_scheduling(void)
1453 + unsigned long orig_flush_base, flush_base, flags, *p;
1454 + unsigned int ecache_size, order;
1455 + cycles_t tick1, tick2, raw;
1457 + /* Approximate heuristic for SMP scheduling. It is an
1458 + * estimation of the time it takes to flush the L2 cache
1459 + * on the local processor.
1461 + * The ia32 chooses to use the L1 cache flush time instead,
1462 + * and I consider this complete nonsense. The Ultra can service
1463 + * a miss to the L1 with a hit to the L2 in 7 or 8 cycles, and
1464 + * L2 misses are what create extra bus traffic (ie. the "cost"
1465 + * of moving a process from one cpu to another).
1467 + printk("SMP: Calibrating ecache flush... ");
1468 + if (tlb_type == cheetah || tlb_type == cheetah_plus) {
1469 + cacheflush_time = cheetah_tune_scheduling();
1473 + ecache_size = prom_getintdefault(linux_cpus[0].prom_node,
1474 + "ecache-size", (512 * 1024));
1475 + if (ecache_size > (4 * 1024 * 1024))
1476 + ecache_size = (4 * 1024 * 1024);
1477 + orig_flush_base = flush_base =
1478 + __get_free_pages(GFP_KERNEL, order = get_order(ecache_size));
1480 + if (flush_base != 0UL) {
1481 + local_irq_save(flags);
1483 + /* Scan twice the size once just to get the TLB entries
1484 + * loaded and make sure the second scan measures pure misses.
1486 + for (p = (unsigned long *)flush_base;
1487 + ((unsigned long)p) < (flush_base + (ecache_size<<1));
1488 + p += (64 / sizeof(unsigned long)))
1489 + *((volatile unsigned long *)p);
1491 + /* Now the real measurement. */
1492 + if (!SPARC64_USE_STICK) {
1493 + __asm__ __volatile__("b,pt %%xcc, 1f\n\t"
1494 + " rd %%tick, %0\n\t"
1496 + "1:\tldx [%2 + 0x000], %%g1\n\t"
1497 + "ldx [%2 + 0x040], %%g2\n\t"
1498 + "ldx [%2 + 0x080], %%g3\n\t"
1499 + "ldx [%2 + 0x0c0], %%g5\n\t"
1500 + "add %2, 0x100, %2\n\t"
1502 + "bne,pt %%xcc, 1b\n\t"
1504 + "rd %%tick, %1\n\t"
1505 + : "=&r" (tick1), "=&r" (tick2),
1506 + "=&r" (flush_base)
1507 + : "2" (flush_base),
1508 + "r" (flush_base + ecache_size)
1509 + : "g1", "g2", "g3", "g5");
1511 + __asm__ __volatile__("b,pt %%xcc, 1f\n\t"
1512 + " rd %%asr24, %0\n\t"
1514 + "1:\tldx [%2 + 0x000], %%g1\n\t"
1515 + "ldx [%2 + 0x040], %%g2\n\t"
1516 + "ldx [%2 + 0x080], %%g3\n\t"
1517 + "ldx [%2 + 0x0c0], %%g5\n\t"
1518 + "add %2, 0x100, %2\n\t"
1520 + "bne,pt %%xcc, 1b\n\t"
1522 + "rd %%asr24, %1\n\t"
1523 + : "=&r" (tick1), "=&r" (tick2),
1524 + "=&r" (flush_base)
1525 + : "2" (flush_base),
1526 + "r" (flush_base + ecache_size)
1527 + : "g1", "g2", "g3", "g5");
1530 + local_irq_restore(flags);
1532 + raw = (tick2 - tick1);
1534 + /* Dampen it a little, considering two processes
1535 + * sharing the cache and fitting.
1537 + cacheflush_time = (raw - (raw >> 2));
1539 + free_pages(orig_flush_base, order);
1541 + cacheflush_time = ((ecache_size << 2) +
1542 + (ecache_size << 1));
1545 + /* Convert ticks/sticks to jiffies. */
1546 + cache_decay_ticks = cacheflush_time / timer_tick_offset;
1548 + printk("Using heuristic of %ld cycles, %ld ticks.\n",
1549 + cacheflush_time, cache_decay_ticks);
1552 static inline unsigned long find_flush_base(unsigned long size)
1553 diff -urN linux-2.4.20/arch/sparc64/kernel/traps.c linux-2.4.20-o1/arch/sparc64/kernel/traps.c
1554 --- linux-2.4.20/arch/sparc64/kernel/traps.c Fri Nov 29 00:53:12 2002
1555 +++ linux-2.4.20-o1/arch/sparc64/kernel/traps.c Wed Mar 12 00:41:43 2003
1556 @@ -570,6 +570,48 @@
1557 "i" (ASI_PHYS_USE_EC));
1561 +unsigned long __init cheetah_tune_scheduling(void)
1563 + unsigned long tick1, tick2, raw;
1564 + unsigned long flush_base = ecache_flush_physbase;
1565 + unsigned long flush_linesize = ecache_flush_linesize;
1566 + unsigned long flush_size = ecache_flush_size;
1568 + /* Run through the whole cache to guarantee the timed loop
1569 + * is really displacing cache lines.
1571 + __asm__ __volatile__("1: subcc %0, %4, %0\n\t"
1572 + " bne,pt %%xcc, 1b\n\t"
1573 + " ldxa [%2 + %0] %3, %%g0\n\t"
1574 + : "=&r" (flush_size)
1575 + : "0" (flush_size), "r" (flush_base),
1576 + "i" (ASI_PHYS_USE_EC), "r" (flush_linesize));
1578 + /* The flush area is 2 X Ecache-size, so cut this in half for
1581 + flush_base = ecache_flush_physbase;
1582 + flush_linesize = ecache_flush_linesize;
1583 + flush_size = ecache_flush_size >> 1;
1585 + __asm__ __volatile__("rd %%tick, %0" : "=r" (tick1));
1587 + __asm__ __volatile__("1: subcc %0, %4, %0\n\t"
1588 + " bne,pt %%xcc, 1b\n\t"
1589 + " ldxa [%2 + %0] %3, %%g0\n\t"
1590 + : "=&r" (flush_size)
1591 + : "0" (flush_size), "r" (flush_base),
1592 + "i" (ASI_PHYS_USE_EC), "r" (flush_linesize));
1594 + __asm__ __volatile__("rd %%tick, %0" : "=r" (tick2));
1596 + raw = (tick2 - tick1);
1598 + return (raw - (raw >> 2));
1602 /* Unfortunately, the diagnostic access to the I-cache tags we need to
1603 * use to clear the thing interferes with I-cache coherency transactions.
1605 diff -urN linux-2.4.20/drivers/block/loop.c linux-2.4.20-o1/drivers/block/loop.c
1606 --- linux-2.4.20/drivers/block/loop.c Fri Nov 29 00:53:12 2002
1607 +++ linux-2.4.20-o1/drivers/block/loop.c Wed Mar 12 00:41:43 2003
1609 flush_signals(current);
1610 spin_unlock_irq(&current->sigmask_lock);
1612 - current->policy = SCHED_OTHER;
1613 - current->nice = -20;
1615 spin_lock_irq(&lo->lo_lock);
1616 lo->lo_state = Lo_bound;
1617 atomic_inc(&lo->lo_pending);
1618 diff -urN linux-2.4.20/drivers/char/drm-4.0/tdfx_drv.c linux-2.4.20-o1/drivers/char/drm-4.0/tdfx_drv.c
1619 --- linux-2.4.20/drivers/char/drm-4.0/tdfx_drv.c Fri Nov 29 00:53:12 2002
1620 +++ linux-2.4.20-o1/drivers/char/drm-4.0/tdfx_drv.c Wed Mar 12 00:41:43 2003
1622 lock.context, current->pid, j,
1623 dev->lock.lock_time, jiffies);
1624 current->state = TASK_INTERRUPTIBLE;
1625 - current->policy |= SCHED_YIELD;
1626 schedule_timeout(DRM_LOCK_SLICE-j);
1627 DRM_DEBUG("jiffies=%d\n", jiffies);
1629 diff -urN linux-2.4.20/drivers/char/mwave/mwavedd.c linux-2.4.20-o1/drivers/char/mwave/mwavedd.c
1630 --- linux-2.4.20/drivers/char/mwave/mwavedd.c Mon Feb 25 20:37:57 2002
1631 +++ linux-2.4.20-o1/drivers/char/mwave/mwavedd.c Wed Mar 12 00:41:43 2003
1633 pDrvData->IPCs[ipcnum].bIsHere = FALSE;
1634 pDrvData->IPCs[ipcnum].bIsEnabled = TRUE;
1635 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,4,0)
1636 - current->nice = -20; /* boost to provide priority timing */
1638 current->priority = 0x28; /* boost to provide priority timing */
1640 diff -urN linux-2.4.20/drivers/char/serial_txx927.c linux-2.4.20-o1/drivers/char/serial_txx927.c
1641 --- linux-2.4.20/drivers/char/serial_txx927.c Sat Aug 3 02:39:43 2002
1642 +++ linux-2.4.20-o1/drivers/char/serial_txx927.c Wed Mar 12 00:41:43 2003
1643 @@ -1533,7 +1533,6 @@
1644 printk("cisr = %d (jiff=%lu)...", cisr, jiffies);
1646 current->state = TASK_INTERRUPTIBLE;
1647 - current->counter = 0; /* make us low-priority */
1648 schedule_timeout(char_time);
1649 if (signal_pending(current))
1651 diff -urN linux-2.4.20/drivers/md/md.c linux-2.4.20-o1/drivers/md/md.c
1652 --- linux-2.4.20/drivers/md/md.c Fri Nov 29 00:53:13 2002
1653 +++ linux-2.4.20-o1/drivers/md/md.c Wed Mar 12 00:41:43 2003
1654 @@ -2936,8 +2936,6 @@
1655 * bdflush, otherwise bdflush will deadlock if there are too
1656 * many dirty RAID5 blocks.
1658 - current->policy = SCHED_OTHER;
1659 - current->nice = -20;
1662 complete(thread->event);
1663 @@ -3391,11 +3389,6 @@
1664 "(but not more than %d KB/sec) for reconstruction.\n",
1665 sysctl_speed_limit_max);
1668 - * Resync has low priority.
1670 - current->nice = 19;
1672 is_mddev_idle(mddev); /* this also initializes IO event counters */
1673 for (m = 0; m < SYNC_MARKS; m++) {
1675 @@ -3473,16 +3466,13 @@
1676 currspeed = (j-mddev->resync_mark_cnt)/2/((jiffies-mddev->resync_mark)/HZ +1) +1;
1678 if (currspeed > sysctl_speed_limit_min) {
1679 - current->nice = 19;
1681 if ((currspeed > sysctl_speed_limit_max) ||
1682 !is_mddev_idle(mddev)) {
1683 current->state = TASK_INTERRUPTIBLE;
1684 md_schedule_timeout(HZ/4);
1688 - current->nice = -20;
1691 printk(KERN_INFO "md: md%d: sync done.\n",mdidx(mddev));
1693 diff -urN linux-2.4.20/fs/binfmt_elf.c linux-2.4.20-o1/fs/binfmt_elf.c
1694 --- linux-2.4.20/fs/binfmt_elf.c Sat Aug 3 02:39:45 2002
1695 +++ linux-2.4.20-o1/fs/binfmt_elf.c Wed Mar 12 00:41:43 2003
1696 @@ -1143,7 +1143,7 @@
1697 psinfo.pr_state = i;
1698 psinfo.pr_sname = (i < 0 || i > 5) ? '.' : "RSDZTD"[i];
1699 psinfo.pr_zomb = psinfo.pr_sname == 'Z';
1700 - psinfo.pr_nice = current->nice;
1701 + psinfo.pr_nice = task_nice(current);
1702 psinfo.pr_flag = current->flags;
1703 psinfo.pr_uid = NEW_TO_OLD_UID(current->uid);
1704 psinfo.pr_gid = NEW_TO_OLD_GID(current->gid);
1705 diff -urN linux-2.4.20/fs/jffs2/background.c linux-2.4.20-o1/fs/jffs2/background.c
1706 --- linux-2.4.20/fs/jffs2/background.c Thu Oct 25 09:07:09 2001
1707 +++ linux-2.4.20-o1/fs/jffs2/background.c Wed Mar 12 00:41:43 2003
1710 sprintf(current->comm, "jffs2_gcd_mtd%d", c->mtd->index);
1712 - /* FIXME in the 2.2 backport */
1713 - current->nice = 10;
1716 spin_lock_irq(&current->sigmask_lock);
1717 siginitsetinv (&current->blocked, sigmask(SIGHUP) | sigmask(SIGKILL) | sigmask(SIGSTOP) | sigmask(SIGCONT));
1718 diff -urN linux-2.4.20/fs/proc/array.c linux-2.4.20-o1/fs/proc/array.c
1719 --- linux-2.4.20/fs/proc/array.c Sat Aug 3 02:39:45 2002
1720 +++ linux-2.4.20-o1/fs/proc/array.c Wed Mar 12 00:41:43 2003
1723 /* scale priority and nice values from timeslices to -20..20 */
1724 /* to make it look like a "normal" Unix priority/nice value */
1725 - priority = task->counter;
1726 - priority = 20 - (priority * 10 + DEF_COUNTER / 2) / DEF_COUNTER;
1727 - nice = task->nice;
1728 + priority = task_prio(task);
1729 + nice = task_nice(task);
1731 read_lock(&tasklist_lock);
1732 ppid = task->pid ? task->p_opptr->pid : 0;
1742 diff -urN linux-2.4.20/fs/proc/proc_misc.c linux-2.4.20-o1/fs/proc/proc_misc.c
1743 --- linux-2.4.20/fs/proc/proc_misc.c Fri Nov 29 00:53:15 2002
1744 +++ linux-2.4.20-o1/fs/proc/proc_misc.c Wed Mar 12 00:41:43 2003
1745 @@ -106,11 +106,11 @@
1746 a = avenrun[0] + (FIXED_1/200);
1747 b = avenrun[1] + (FIXED_1/200);
1748 c = avenrun[2] + (FIXED_1/200);
1749 - len = sprintf(page,"%d.%02d %d.%02d %d.%02d %d/%d %d\n",
1750 + len = sprintf(page,"%d.%02d %d.%02d %d.%02d %ld/%d %d\n",
1751 LOAD_INT(a), LOAD_FRAC(a),
1752 LOAD_INT(b), LOAD_FRAC(b),
1753 LOAD_INT(c), LOAD_FRAC(c),
1754 - nr_running, nr_threads, last_pid);
1755 + nr_running(), nr_threads, last_pid);
1756 return proc_calc_metrics(page, start, off, count, eof, len);
1763 - idle = init_tasks[0]->times.tms_utime + init_tasks[0]->times.tms_stime;
1764 + idle = init_task.times.tms_utime + init_task.times.tms_stime;
1766 /* The formula for the fraction parts really is ((t * 100) / HZ) % 100, but
1767 that would overflow about every five days at HZ == 100.
1768 @@ -371,10 +371,10 @@
1771 proc_sprintf(page, &off, &len,
1776 - kstat.context_swtch,
1777 + nr_context_switches(),
1778 xtime.tv_sec - jif / HZ,
1781 diff -urN linux-2.4.20/fs/reiserfs/buffer2.c linux-2.4.20-o1/fs/reiserfs/buffer2.c
1782 --- linux-2.4.20/fs/reiserfs/buffer2.c Fri Nov 29 00:53:15 2002
1783 +++ linux-2.4.20-o1/fs/reiserfs/buffer2.c Wed Mar 12 00:41:43 2003
1785 struct buffer_head * reiserfs_bread (struct super_block *super, int n_block, int n_size)
1787 struct buffer_head *result;
1788 - PROC_EXP( unsigned int ctx_switches = kstat.context_swtch );
1789 + PROC_EXP( unsigned int ctx_switches = nr_context_switches(); );
1791 result = bread (super -> s_dev, n_block, n_size);
1792 PROC_INFO_INC( super, breads );
1793 - PROC_EXP( if( kstat.context_swtch != ctx_switches )
1794 + PROC_EXP( if( nr_context_switches() != ctx_switches )
1795 PROC_INFO_INC( super, bread_miss ) );
1798 diff -urN linux-2.4.20/include/asm-alpha/bitops.h linux-2.4.20-o1/include/asm-alpha/bitops.h
1799 --- linux-2.4.20/include/asm-alpha/bitops.h Sat Oct 13 00:35:54 2001
1800 +++ linux-2.4.20-o1/include/asm-alpha/bitops.h Wed Mar 12 00:41:43 2003
1803 #include <linux/config.h>
1804 #include <linux/kernel.h>
1805 +#include <asm/compiler.h>
1808 * Copyright 1994, Linus Torvalds.
1811 __asm__ __volatile__(
1820 :"=&r" (temp), "=m" (*m)
1821 - :"Ir" (~(1UL << (nr & 31))), "m" (*m));
1822 + :"Ir" (1UL << (nr & 31)), "m" (*m));
1826 * WARNING: non atomic version.
1828 static __inline__ void
1829 -__change_bit(unsigned long nr, volatile void * addr)
1830 +__clear_bit(unsigned long nr, volatile void * addr)
1832 int *m = ((int *) addr) + (nr >> 5);
1834 - *m ^= 1 << (nr & 31);
1835 + *m &= ~(1 << (nr & 31));
1840 :"Ir" (1UL << (nr & 31)), "m" (*m));
1844 + * WARNING: non atomic version.
1846 +static __inline__ void
1847 +__change_bit(unsigned long nr, volatile void * addr)
1849 + int *m = ((int *) addr) + (nr >> 5);
1851 + *m ^= 1 << (nr & 31);
1855 test_and_set_bit(unsigned long nr, volatile void *addr)
1857 @@ -181,20 +193,6 @@
1858 return (old & mask) != 0;
1862 - * WARNING: non atomic version.
1864 -static __inline__ int
1865 -__test_and_change_bit(unsigned long nr, volatile void * addr)
1867 - unsigned long mask = 1 << (nr & 0x1f);
1868 - int *m = ((int *) addr) + (nr >> 5);
1872 - return (old & mask) != 0;
1876 test_and_change_bit(unsigned long nr, volatile void * addr)
1878 @@ -220,6 +218,20 @@
1883 + * WARNING: non atomic version.
1885 +static __inline__ int
1886 +__test_and_change_bit(unsigned long nr, volatile void * addr)
1888 + unsigned long mask = 1 << (nr & 0x1f);
1889 + int *m = ((int *) addr) + (nr >> 5);
1893 + return (old & mask) != 0;
1897 test_bit(int nr, volatile void * addr)
1899 @@ -235,12 +247,15 @@
1901 static inline unsigned long ffz_b(unsigned long x)
1903 - unsigned long sum = 0;
1904 + unsigned long sum, x1, x2, x4;
1906 x = ~x & -~x; /* set first 0 bit, clear others */
1907 - if (x & 0xF0) sum += 4;
1908 - if (x & 0xCC) sum += 2;
1909 - if (x & 0xAA) sum += 1;
1914 + sum += (x4 != 0) * 4;
1919 @@ -257,24 +272,46 @@
1921 __asm__("cmpbge %1,%2,%0" : "=r"(bits) : "r"(word), "r"(~0UL));
1923 - __asm__("extbl %1,%2,%0" : "=r"(bits) : "r"(word), "r"(qofs));
1924 + bits = __kernel_extbl(word, qofs);
1927 return qofs*8 + bofs;
1932 + * __ffs = Find First set bit in word. Undefined if no set bit exists.
1934 +static inline unsigned long __ffs(unsigned long word)
1936 +#if defined(__alpha_cix__) && defined(__alpha_fix__)
1937 + /* Whee. EV67 can calculate it directly. */
1938 + unsigned long result;
1939 + __asm__("cttz %1,%0" : "=r"(result) : "r"(word));
1942 + unsigned long bits, qofs, bofs;
1944 + __asm__("cmpbge $31,%1,%0" : "=r"(bits) : "r"(word));
1945 + qofs = ffz_b(bits);
1946 + bits = __kernel_extbl(word, qofs);
1947 + bofs = ffz_b(~bits);
1949 + return qofs*8 + bofs;
1950 +#endif
1956 * ffs: find first bit set. This is defined the same way as
1957 * the libc and compiler builtin ffs routines, therefore
1958 - * differs in spirit from the above ffz (man ffs).
1959 + * differs in spirit from the above __ffs.
1962 static inline int ffs(int word)
1964 - int result = ffz(~word);
1965 + int result = __ffs(word);
1966 return word ? result+1 : 0;
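The three lookup primitives used here differ only in convention: __ffs()
returns the 0-based index of the lowest set bit (undefined for 0), ffs() is
the 1-based libc variant with ffs(0) == 0, and ffz() returns the index of the
lowest clear bit (undefined for ~0UL). A portable reference model
(illustration only; the patch supplies optimized per-architecture versions):

	/* 0-based index of the lowest set bit; w must be non-zero */
	static unsigned long ref_ffs0(unsigned long w)
	{
		unsigned long b = 0;

		while (!(w & 1)) {
			w >>= 1;
			b++;
		}
		return b;
	}

Here ref_ffs0(0x18) == 3, so ffs(0x18) == 4, and since the first zero of x is
the first one of ~x, ffz(x) behaves as ref_ffs0(~x).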
1969 @@ -316,6 +353,14 @@
1970 #define hweight16(x) hweight64((x) & 0xfffful)
1971 #define hweight8(x) hweight64((x) & 0xfful)
1973 +static inline unsigned long hweight64(unsigned long w)
1975 + unsigned long result;
1976 + for (result = 0; w ; w >>= 1)
1977 + result += (w & 1);
1981 #define hweight32(x) generic_hweight32(x)
1982 #define hweight16(x) generic_hweight16(x)
1983 #define hweight8(x) generic_hweight8(x)
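The hweight64() added above is the fallback for CPUs without the CIX
population-count instruction: one loop iteration per remaining bit of the
argument. Only the number of set bits matters, not their positions, e.g.
hweight64(0x8000000000000001UL) == 2 and hweight64(0xffUL) == 8.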
1984 @@ -365,12 +410,76 @@
1988 - * The optimizer actually does good code for this case..
1989 + * Find the next set bit in a bitmap reasonably efficiently.
1991 +static inline unsigned long
1992 +find_next_bit(void * addr, unsigned long size, unsigned long offset)
1994 + unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
1995 + unsigned long result = offset & ~63UL;
1996 + unsigned long tmp;
1998 + if (offset >= size)
2004 + tmp &= ~0UL << offset;
2008 + goto found_middle;
2012 + while (size & ~63UL) {
2013 + if ((tmp = *(p++)))
2014 + goto found_middle;
2022 + tmp &= ~0UL >> (64 - size);
2024 + return result + size;
2026 + return result + __ffs(tmp);
2030 + * The optimizer actually does good code for this case.
2032 #define find_first_zero_bit(addr, size) \
2033 find_next_zero_bit((addr), (size), 0)
2034 +#define find_first_bit(addr, size) \
2035 + find_next_bit((addr), (size), 0)
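find_first_bit()/find_next_bit() mirror the existing zero-bit searches and
are typically used to walk every set bit of a bitmap. A usage sketch assuming
the definitions above; do_something() is a placeholder, and longs are 64-bit
on alpha:

	unsigned long map[2] = { 0x9UL, 0x4UL };	/* bits 0, 3 and 66 */
	unsigned long bit, size = 128;

	for (bit = find_first_bit(map, size);
	     bit < size;
	     bit = find_next_bit(map, size, bit + 1))
		do_something(bit);	/* visits 0, 3, 66 */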
2040 + * Every architecture must define this function. It's the fastest
2041 + * way of searching a 140-bit bitmap where the first 100 bits are
2042 + * unlikely to be set. It's guaranteed that at least one of the 140
2045 +static inline unsigned long
2046 +sched_find_first_bit(unsigned long b[3])
2048 + unsigned long b0 = b[0], b1 = b[1], b2 = b[2];
2049 + unsigned long ofs;
2051 + ofs = (b1 ? 64 : 128);
2052 + b1 = (b1 ? b1 : b2);
2053 + ofs = (b0 ? 0 : ofs);
2054 + b0 = (b0 ? b0 : b1);
2056 + return __ffs(b0) + ofs;
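The selection above is deliberately branch-free: the conditional expressions
pick the first non-zero quadword and its offset. Worked trace, assuming
b[0] == 0, b[1] == 0 and b[2] with bit 10 set: ofs becomes 128 (b1 is zero),
b1 becomes b2, ofs stays 128 (b0 is zero), b0 becomes b2, so the result is
__ffs(b[2]) + 128 = 10 + 128 = 138, i.e. bit 138 of the 140-bit priority
bitmap.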
2060 #define ext2_set_bit __test_and_set_bit
2061 #define ext2_clear_bit __test_and_clear_bit
2062 diff -urN linux-2.4.20/include/asm-alpha/smp.h linux-2.4.20-o1/include/asm-alpha/smp.h
2063 --- linux-2.4.20/include/asm-alpha/smp.h Fri Sep 14 00:21:32 2001
2064 +++ linux-2.4.20-o1/include/asm-alpha/smp.h Wed Mar 12 00:41:43 2003
2066 #define cpu_logical_map(cpu) __cpu_logical_map[cpu]
2068 #define hard_smp_processor_id() __hard_smp_processor_id()
2069 -#define smp_processor_id() (current->processor)
2070 +#define smp_processor_id() (current->cpu)
2072 extern unsigned long cpu_present_mask;
2073 #define cpu_online_map cpu_present_mask
2074 diff -urN linux-2.4.20/include/asm-alpha/system.h linux-2.4.20-o1/include/asm-alpha/system.h
2075 --- linux-2.4.20/include/asm-alpha/system.h Fri Oct 5 03:47:08 2001
2076 +++ linux-2.4.20-o1/include/asm-alpha/system.h Wed Mar 12 00:41:43 2003
2078 extern void halt(void) __attribute__((noreturn));
2079 #define __halt() __asm__ __volatile__ ("call_pal %0 #halt" : : "i" (PAL_halt))
2081 -#define prepare_to_switch() do { } while(0)
2082 #define switch_to(prev,next,last) \
2084 unsigned long pcbb; \
2085 diff -urN linux-2.4.20/include/asm-arm/bitops.h linux-2.4.20-o1/include/asm-arm/bitops.h
2086 --- linux-2.4.20/include/asm-arm/bitops.h Sun Aug 12 20:14:00 2001
2087 +++ linux-2.4.20-o1/include/asm-arm/bitops.h Wed Mar 12 00:41:43 2003
2089 * Copyright 1995, Russell King.
2090 * Various bits and pieces copyrights include:
2091 * Linus Torvalds (test_bit).
2092 + * Big endian support: Copyright 2001, Nicolas Pitre
2093 + * reworked by rmk.
2095 * bit 0 is the LSB of addr; bit 32 is the LSB of (addr+1).
2097 @@ -17,81 +19,271 @@
2101 +#include <asm/system.h>
2103 #define smp_mb__before_clear_bit() do { } while (0)
2104 #define smp_mb__after_clear_bit() do { } while (0)
2107 - * Function prototypes to keep gcc -Wall happy.
2108 + * These functions are the basis of our bit ops.
2109 + * First, the atomic bitops.
2111 + * The endian issue for these functions is handled by the macros below.
2113 -extern void set_bit(int nr, volatile void * addr);
2115 +____atomic_set_bit_mask(unsigned int mask, volatile unsigned char *p)
2117 + unsigned long flags;
2119 + local_irq_save(flags);
2120 + *p |= mask;
2121 + local_irq_restore(flags);
2125 +____atomic_clear_bit_mask(unsigned int mask, volatile unsigned char *p)
2127 + unsigned long flags;
2129 + local_irq_save(flags);
2130 + *p &= ~mask;
2131 + local_irq_restore(flags);
2135 +____atomic_change_bit_mask(unsigned int mask, volatile unsigned char *p)
2137 + unsigned long flags;
2139 + local_irq_save(flags);
2140 + *p ^= mask;
2141 + local_irq_restore(flags);
2144 -static inline void __set_bit(int nr, volatile void *addr)
2146 +____atomic_test_and_set_bit_mask(unsigned int mask, volatile unsigned char *p)
2148 - ((unsigned char *) addr)[nr >> 3] |= (1U << (nr & 7));
2149 + unsigned long flags;
2150 + unsigned int res;
2152 + local_irq_save(flags);
2153 + res = *p;
2154 + *p = res | mask;
2155 + local_irq_restore(flags);
2157 + return res & mask;
2160 -extern void clear_bit(int nr, volatile void * addr);
2162 +____atomic_test_and_clear_bit_mask(unsigned int mask, volatile unsigned char *p)
2164 + unsigned long flags;
2165 + unsigned int res;
2167 + local_irq_save(flags);
2168 + res = *p;
2169 + *p = res & ~mask;
2170 + local_irq_restore(flags);
2172 + return res & mask;
2175 -static inline void __clear_bit(int nr, volatile void *addr)
2177 +____atomic_test_and_change_bit_mask(unsigned int mask, volatile unsigned char *p)
2179 - ((unsigned char *) addr)[nr >> 3] &= ~(1U << (nr & 7));
2180 + unsigned long flags;
2181 + unsigned int res;
2183 + local_irq_save(flags);
2184 + res = *p;
2185 + *p = res ^ mask;
2186 + local_irq_restore(flags);
2188 + return res & mask;
2191 -extern void change_bit(int nr, volatile void * addr);
2193 + * Now the non-atomic variants. We let the compiler handle all optimisations
2196 +static inline void ____nonatomic_set_bit(int nr, volatile void *p)
2198 + ((unsigned char *) p)[nr >> 3] |= (1U << (nr & 7));
2201 -static inline void __change_bit(int nr, volatile void *addr)
2202 +static inline void ____nonatomic_clear_bit(int nr, volatile void *p)
2204 - ((unsigned char *) addr)[nr >> 3] ^= (1U << (nr & 7));
2205 + ((unsigned char *) p)[nr >> 3] &= ~(1U << (nr & 7));
2208 -extern int test_and_set_bit(int nr, volatile void * addr);
2209 +static inline void ____nonatomic_change_bit(int nr, volatile void *p)
2211 + ((unsigned char *) p)[nr >> 3] ^= (1U << (nr & 7));
2214 -static inline int __test_and_set_bit(int nr, volatile void *addr)
2215 +static inline int ____nonatomic_test_and_set_bit(int nr, volatile void *p)
2217 unsigned int mask = 1 << (nr & 7);
2218 unsigned int oldval;
2220 - oldval = ((unsigned char *) addr)[nr >> 3];
2221 - ((unsigned char *) addr)[nr >> 3] = oldval | mask;
2222 + oldval = ((unsigned char *) p)[nr >> 3];
2223 + ((unsigned char *) p)[nr >> 3] = oldval | mask;
2224 return oldval & mask;
2227 -extern int test_and_clear_bit(int nr, volatile void * addr);
2229 -static inline int __test_and_clear_bit(int nr, volatile void *addr)
2230 +static inline int ____nonatomic_test_and_clear_bit(int nr, volatile void *p)
2232 unsigned int mask = 1 << (nr & 7);
2233 unsigned int oldval;
2235 - oldval = ((unsigned char *) addr)[nr >> 3];
2236 - ((unsigned char *) addr)[nr >> 3] = oldval & ~mask;
2237 + oldval = ((unsigned char *) p)[nr >> 3];
2238 + ((unsigned char *) p)[nr >> 3] = oldval & ~mask;
2239 return oldval & mask;
2242 -extern int test_and_change_bit(int nr, volatile void * addr);
2244 -static inline int __test_and_change_bit(int nr, volatile void *addr)
2245 +static inline int ____nonatomic_test_and_change_bit(int nr, volatile void *p)
2247 unsigned int mask = 1 << (nr & 7);
2248 unsigned int oldval;
2250 - oldval = ((unsigned char *) addr)[nr >> 3];
2251 - ((unsigned char *) addr)[nr >> 3] = oldval ^ mask;
2252 + oldval = ((unsigned char *) p)[nr >> 3];
2253 + ((unsigned char *) p)[nr >> 3] = oldval ^ mask;
2254 return oldval & mask;
2257 -extern int find_first_zero_bit(void * addr, unsigned size);
2258 -extern int find_next_zero_bit(void * addr, int size, int offset);
2261 * This routine doesn't need to be atomic.
2263 -static inline int test_bit(int nr, const void * addr)
2264 +static inline int ____test_bit(int nr, const void * p)
2266 - return ((unsigned char *) addr)[nr >> 3] & (1U << (nr & 7));
2267 + return ((volatile unsigned char *) p)[nr >> 3] & (1U << (nr & 7));
2271 + * A note about Endian-ness.
2272 + * -------------------------
2274 + * When the ARM is put into big endian mode via CR15, the processor
2275 + * merely swaps the order of bytes within words, thus:
2277 + * ------------ physical data bus bits -----------
2278 + * D31 ... D24 D23 ... D16 D15 ... D8 D7 ... D0
2279 + * little byte 3 byte 2 byte 1 byte 0
2280 + * big byte 0 byte 1 byte 2 byte 3
2282 + * This means that reading a 32-bit word at address 0 returns the same
2283 + * value irrespective of the endian mode bit.
2285 + * Peripheral devices should be connected with the data bus reversed in
2286 + * "Big Endian" mode. ARM Application Note 61 is applicable, and is
2287 + * available from http://www.arm.com/.
2289 + * The following assumes that the data bus connectivity for big endian
2290 + * mode has been followed.
2292 + * Note that bit 0 is defined to be 32-bit word bit 0, not byte 0 bit 0.
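Concretely, the byte-lane swap means byte-wide accesses in big-endian mode
must invert the low two address bits: word bit nr lives in byte nr >> 3, and
in big-endian mode that byte is found at offset (nr >> 3) ^ 3 within the
word. The same correction can be applied to the bit number itself as
nr ^ 0x18, since 0x18 == 24 is three bytes worth of bits. The identities the
macros below rely on:

	((nr ^ 0x18) >> 3) == ((nr >> 3) ^ 3)	/* same byte selected   */
	((nr ^ 0x18) & 7)  == (nr & 7)		/* same bit within byte */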
2296 + * Little endian assembly bitops. nr = 0 -> byte 0 bit 0.
2298 +extern void _set_bit_le(int nr, volatile void * p);
2299 +extern void _clear_bit_le(int nr, volatile void * p);
2300 +extern void _change_bit_le(int nr, volatile void * p);
2301 +extern int _test_and_set_bit_le(int nr, volatile void * p);
2302 +extern int _test_and_clear_bit_le(int nr, volatile void * p);
2303 +extern int _test_and_change_bit_le(int nr, volatile void * p);
2304 +extern int _find_first_zero_bit_le(void * p, unsigned size);
2305 +extern int _find_next_zero_bit_le(void * p, int size, int offset);
2308 + * Big endian assembly bitops. nr = 0 -> byte 3 bit 0.
2310 +extern void _set_bit_be(int nr, volatile void * p);
2311 +extern void _clear_bit_be(int nr, volatile void * p);
2312 +extern void _change_bit_be(int nr, volatile void * p);
2313 +extern int _test_and_set_bit_be(int nr, volatile void * p);
2314 +extern int _test_and_clear_bit_be(int nr, volatile void * p);
2315 +extern int _test_and_change_bit_be(int nr, volatile void * p);
2316 +extern int _find_first_zero_bit_be(void * p, unsigned size);
2317 +extern int _find_next_zero_bit_be(void * p, int size, int offset);
2321 + * The __* form of bitops are non-atomic and may be reordered.
2323 +#define ATOMIC_BITOP_LE(name,nr,p) \
2324 + (__builtin_constant_p(nr) ? \
2325 + ____atomic_##name##_mask(1 << ((nr) & 7), \
2326 + ((unsigned char *)(p)) + ((nr) >> 3)) : \
2327 + _##name##_le(nr,p))
2329 +#define ATOMIC_BITOP_BE(name,nr,p) \
2330 + (__builtin_constant_p(nr) ? \
2331 + ____atomic_##name##_mask(1 << ((nr) & 7), \
2332 + ((unsigned char *)(p)) + (((nr) >> 3) ^ 3)) : \
2333 + _##name##_be(nr,p))
2335 +#define NONATOMIC_BITOP_LE(name,nr,p) \
2336 + (____nonatomic_##name(nr, p))
2338 +#define NONATOMIC_BITOP_BE(name,nr,p) \
2339 + (____nonatomic_##name(nr ^ 0x18, p))
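The dispatch in ATOMIC_BITOP_LE/BE prefers the IRQ-masked inline sequence
when the bit number is a compile-time constant, because the mask and byte
offset then fold away entirely; a runtime bit number calls out to the
assembly implementation instead. Assuming nr == 5 is recognized as constant,
the little-endian expansion of set_bit(5, p) is roughly:

	____atomic_set_bit_mask(1 << (5 & 7),			/* mask 0x20 */
			((unsigned char *)(p)) + (5 >> 3));	/* byte 0    */

while set_bit(n, p) with a runtime n becomes _set_bit_le(n, p).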
2343 + * These are the little endian, atomic definitions.
2345 +#define set_bit(nr,p) ATOMIC_BITOP_LE(set_bit,nr,p)
2346 +#define clear_bit(nr,p) ATOMIC_BITOP_LE(clear_bit,nr,p)
2347 +#define change_bit(nr,p) ATOMIC_BITOP_LE(change_bit,nr,p)
2348 +#define test_and_set_bit(nr,p) ATOMIC_BITOP_LE(test_and_set_bit,nr,p)
2349 +#define test_and_clear_bit(nr,p) ATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
2350 +#define test_and_change_bit(nr,p) ATOMIC_BITOP_LE(test_and_change_bit,nr,p)
2351 +#define test_bit(nr,p) ____test_bit(nr,p)
2352 +#define find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz)
2353 +#define find_next_zero_bit(p,sz,off) _find_next_zero_bit_le(p,sz,off)
2356 + * These are the little endian, non-atomic definitions.
2358 +#define __set_bit(nr,p) NONATOMIC_BITOP_LE(set_bit,nr,p)
2359 +#define __clear_bit(nr,p) NONATOMIC_BITOP_LE(clear_bit,nr,p)
2360 +#define __change_bit(nr,p) NONATOMIC_BITOP_LE(change_bit,nr,p)
2361 +#define __test_and_set_bit(nr,p) NONATOMIC_BITOP_LE(test_and_set_bit,nr,p)
2362 +#define __test_and_clear_bit(nr,p) NONATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
2363 +#define __test_and_change_bit(nr,p) NONATOMIC_BITOP_LE(test_and_change_bit,nr,p)
2364 +#define __test_bit(nr,p) ____test_bit(nr,p)
2369 + * These are the big endian, atomic definitions.
2371 +#define set_bit(nr,p) ATOMIC_BITOP_BE(set_bit,nr,p)
2372 +#define clear_bit(nr,p) ATOMIC_BITOP_BE(clear_bit,nr,p)
2373 +#define change_bit(nr,p) ATOMIC_BITOP_BE(change_bit,nr,p)
2374 +#define test_and_set_bit(nr,p) ATOMIC_BITOP_BE(test_and_set_bit,nr,p)
2375 +#define test_and_clear_bit(nr,p) ATOMIC_BITOP_BE(test_and_clear_bit,nr,p)
2376 +#define test_and_change_bit(nr,p) ATOMIC_BITOP_BE(test_and_change_bit,nr,p)
2377 +#define test_bit(nr,p) ____test_bit((nr) ^ 0x18, p)
2378 +#define find_first_zero_bit(p,sz) _find_first_zero_bit_be(p,sz)
2379 +#define find_next_zero_bit(p,sz,off) _find_next_zero_bit_be(p,sz,off)
2382 + * These are the big endian, non-atomic definitions.
2384 +#define __set_bit(nr,p) NONATOMIC_BITOP_BE(set_bit,nr,p)
2385 +#define __clear_bit(nr,p) NONATOMIC_BITOP_BE(clear_bit,nr,p)
2386 +#define __change_bit(nr,p) NONATOMIC_BITOP_BE(change_bit,nr,p)
2387 +#define __test_and_set_bit(nr,p) NONATOMIC_BITOP_BE(test_and_set_bit,nr,p)
2388 +#define __test_and_clear_bit(nr,p) NONATOMIC_BITOP_BE(test_and_clear_bit,nr,p)
2389 +#define __test_and_change_bit(nr,p) NONATOMIC_BITOP_BE(test_and_change_bit,nr,p)
2390 +#define __test_bit(nr,p) ____test_bit((nr) ^ 0x18, p)
2395 * ffz = Find First Zero in word. Undefined if no zero exists,
2396 * so code should check against ~0UL first..
2398 @@ -110,6 +302,29 @@
2402 + * __ffs = Find First set bit in word. Undefined if no set bit exists,
2403 + * so code should check against 0 first..
2405 +static inline unsigned long __ffs(unsigned long word)
2410 + if (word & 0x0000ffff) { k -= 16; word <<= 16; }
2411 + if (word & 0x00ff0000) { k -= 8; word <<= 8; }
2412 + if (word & 0x0f000000) { k -= 4; word <<= 4; }
2413 + if (word & 0x30000000) { k -= 2; word <<= 2; }
2414 + if (word & 0x40000000) { k -= 1; }
2419 + * fls: find last bit set.
2422 +#define fls(x) generic_fls(x)
2425 * ffs: find first bit set. This is defined the same way as
2426 * the libc and compiler builtin ffs routines, therefore
2427 * differs in spirit from the above ffz (man ffs).
2428 @@ -118,6 +333,22 @@
2429 #define ffs(x) generic_ffs(x)
2432 + * Find first bit set in a 168-bit bitmap, where the first
2433 + * 128 bits are unlikely to be set.
2435 +static inline int sched_find_first_bit(unsigned long *b)
2440 + for (off = 0; v = b[off], off < 4; off++) {
2444 + return __ffs(v) + off * 32;
2448 * hweightN: returns the hamming weight (i.e. the number
2449 * of bits set) of a N-bit word
2451 @@ -126,18 +357,25 @@
2452 #define hweight16(x) generic_hweight16(x)
2453 #define hweight8(x) generic_hweight8(x)
2455 -#define ext2_set_bit test_and_set_bit
2456 -#define ext2_clear_bit test_and_clear_bit
2457 -#define ext2_test_bit test_bit
2458 -#define ext2_find_first_zero_bit find_first_zero_bit
2459 -#define ext2_find_next_zero_bit find_next_zero_bit
2461 -/* Bitmap functions for the minix filesystem. */
2462 -#define minix_test_and_set_bit(nr,addr) test_and_set_bit(nr,addr)
2463 -#define minix_set_bit(nr,addr) set_bit(nr,addr)
2464 -#define minix_test_and_clear_bit(nr,addr) test_and_clear_bit(nr,addr)
2465 -#define minix_test_bit(nr,addr) test_bit(nr,addr)
2466 -#define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
2468 + * Ext2 is defined to use little-endian byte ordering.
2469 + * These do not need to be atomic.
2471 +#define ext2_set_bit(nr,p) NONATOMIC_BITOP_LE(test_and_set_bit,nr,p)
2472 +#define ext2_clear_bit(nr,p) NONATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
2473 +#define ext2_test_bit(nr,p) __test_bit(nr,p)
2474 +#define ext2_find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz)
2475 +#define ext2_find_next_zero_bit(p,sz,off) _find_next_zero_bit_le(p,sz,off)
2478 + * Minix is defined to use little-endian byte ordering.
2479 + * These do not need to be atomic.
2481 +#define minix_set_bit(nr,p) NONATOMIC_BITOP_LE(set_bit,nr,p)
2482 +#define minix_test_bit(nr,p) __test_bit(nr,p)
2483 +#define minix_test_and_set_bit(nr,p) NONATOMIC_BITOP_LE(test_and_set_bit,nr,p)
2484 +#define minix_test_and_clear_bit(nr,p) NONATOMIC_BITOP_LE(test_and_clear_bit,nr,p)
2485 +#define minix_find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz)
2487 #endif /* __KERNEL__ */
2489 diff -urN linux-2.4.20/include/asm-cris/bitops.h linux-2.4.20-o1/include/asm-cris/bitops.h
2490 --- linux-2.4.20/include/asm-cris/bitops.h Mon Feb 25 20:38:10 2002
2491 +++ linux-2.4.20-o1/include/asm-cris/bitops.h Wed Mar 12 00:41:43 2003
2493 /* We use generic_ffs so get it; include guards resolve the possible
2494 mutual inclusion. */
2495 #include <linux/bitops.h>
2496 +#include <linux/compiler.h>
2499 * Some hacks to defeat gcc over-optimizations..
2502 #define set_bit(nr, addr) (void)test_and_set_bit(nr, addr)
2504 +#define __set_bit(nr, addr) (void)__test_and_set_bit(nr, addr)
2507 * clear_bit - Clears a bit in memory
2511 #define clear_bit(nr, addr) (void)test_and_clear_bit(nr, addr)
2513 +#define __clear_bit(nr, addr) (void)__test_and_clear_bit(nr, addr)
2516 * change_bit - Toggle a bit in memory
2519 * It also implies a memory barrier.
2522 -static __inline__ int test_and_set_bit(int nr, void *addr)
2523 +static inline int test_and_set_bit(int nr, void *addr)
2525 unsigned int mask, retval;
2526 unsigned long flags;
2527 @@ -105,6 +110,18 @@
2531 +static inline int __test_and_set_bit(int nr, void *addr)
2533 + unsigned int mask, retval;
2534 + unsigned int *adr = (unsigned int *)addr;
2537 + mask = 1 << (nr & 0x1f);
2538 + retval = (mask & *adr) != 0;
2544 * clear_bit() doesn't provide any barrier for the compiler.
2547 * It also implies a memory barrier.
2550 -static __inline__ int test_and_clear_bit(int nr, void *addr)
2551 +static inline int test_and_clear_bit(int nr, void *addr)
2553 unsigned int mask, retval;
2554 unsigned long flags;
2556 * but actually fail. You must protect multiple accesses with a lock.
2559 -static __inline__ int __test_and_clear_bit(int nr, void *addr)
2560 +static inline int __test_and_clear_bit(int nr, void *addr)
2562 unsigned int mask, retval;
2563 unsigned int *adr = (unsigned int *)addr;
2565 * It also implies a memory barrier.
2568 -static __inline__ int test_and_change_bit(int nr, void *addr)
2569 +static inline int test_and_change_bit(int nr, void *addr)
2571 unsigned int mask, retval;
2572 unsigned long flags;
2575 /* WARNING: non atomic and it can be reordered! */
2577 -static __inline__ int __test_and_change_bit(int nr, void *addr)
2578 +static inline int __test_and_change_bit(int nr, void *addr)
2580 unsigned int mask, retval;
2581 unsigned int *adr = (unsigned int *)addr;
2583 * This routine doesn't need to be atomic.
2586 -static __inline__ int test_bit(int nr, const void *addr)
2587 +static inline int test_bit(int nr, const void *addr)
2590 unsigned int *adr = (unsigned int *)addr;
2592 * number. They differ in that the first function also inverts all bits
2595 -static __inline__ unsigned long cris_swapnwbrlz(unsigned long w)
2596 +static inline unsigned long cris_swapnwbrlz(unsigned long w)
2598 /* Let's just say we return the result in the same register as the
2599 input. Saying we clobber the input but can return the result
2604 -static __inline__ unsigned long cris_swapwbrlz(unsigned long w)
2605 +static inline unsigned long cris_swapwbrlz(unsigned long w)
2608 __asm__ ("swapwbr %0 \n\t"
2610 * ffz = Find First Zero in word. Undefined if no zero exists,
2611 * so code should check against ~0UL first..
2613 -static __inline__ unsigned long ffz(unsigned long w)
2614 +static inline unsigned long ffz(unsigned long w)
2616 /* The generic_ffs function is used to avoid the asm when the
2617 argument is a constant. */
2619 * Somewhat like ffz but the equivalent of generic_ffs: in contrast to
2620 * ffz we return the first one-bit *plus one*.
2622 -static __inline__ unsigned long ffs(unsigned long w)
2623 +static inline unsigned long ffs(unsigned long w)
2625 /* The generic_ffs function is used to avoid the asm when the
2626 argument is a constant. */
2628 * @offset: The bitnumber to start searching at
2629 * @size: The maximum size to search
2631 -static __inline__ int find_next_zero_bit (void * addr, int size, int offset)
2632 +static inline int find_next_zero_bit (void * addr, int size, int offset)
2634 unsigned long *p = ((unsigned long *) addr) + (offset >> 5);
2635 unsigned long result = offset & ~31UL;
2636 @@ -354,7 +371,45 @@
2637 #define minix_test_bit(nr,addr) test_bit(nr,addr)
2638 #define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
2640 -#endif /* __KERNEL__ */
2642 +/* TODO: see below */
2643 +#define sched_find_first_zero_bit(addr) find_first_zero_bit(addr, 168)
2646 +/* TODO: left out pending where to put it.. (there are .h dependencies) */
2649 + * Every architecture must define this function. It's the fastest
2650 + * way of searching a 168-bit bitmap where the first 128 bits are
2651 + * unlikely to be set. It's guaranteed that at least one of the 168
2652 + * bits is cleared.
2655 +#if MAX_RT_PRIO != 128 || MAX_PRIO != 168
2656 +# error update this function.
2659 +#define MAX_RT_PRIO 128
2660 +#define MAX_PRIO 168
2663 +static inline int sched_find_first_zero_bit(char *bitmap)
2665 + unsigned int *b = (unsigned int *)bitmap;
2668 + rt = b[0] & b[1] & b[2] & b[3];
2669 + if (unlikely(rt != 0xffffffff))
2670 + return find_first_zero_bit(bitmap, MAX_RT_PRIO);
2672 + if (b[4] != 0xffffffff)
2673 + return ffz(b[4]) + MAX_RT_PRIO;
2674 + return ffz(b[5]) + 32 + MAX_RT_PRIO;
2680 +#endif /* __KERNEL__ */
2682 #endif /* _CRIS_BITOPS_H */
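The AND trick in sched_find_first_zero_bit() works because
rt = b[0] & b[1] & b[2] & b[3] has a zero bit wherever at least one of the
first four words does, so one comparison against 0xffffffff answers 'is any
of the first 128 bits clear?'. Worked example, assuming b[0..4] all
0xffffffff except bit 2 of b[4] clear: rt == 0xffffffff, the slow path is
skipped, and the result is ffz(b[4]) + MAX_RT_PRIO = 2 + 128 = 130.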
2683 diff -urN linux-2.4.20/include/asm-generic/bitops.h linux-2.4.20-o1/include/asm-generic/bitops.h
2684 --- linux-2.4.20/include/asm-generic/bitops.h Tue Nov 28 02:47:38 2000
2685 +++ linux-2.4.20-o1/include/asm-generic/bitops.h Wed Mar 12 00:41:43 2003
2687 return ((mask & *addr) != 0);
2691 + * fls: find last bit set.
2694 +#define fls(x) generic_fls(x)
2699 diff -urN linux-2.4.20/include/asm-i386/bitops.h linux-2.4.20-o1/include/asm-i386/bitops.h
2700 --- linux-2.4.20/include/asm-i386/bitops.h Fri Nov 29 00:53:15 2002
2701 +++ linux-2.4.20-o1/include/asm-i386/bitops.h Wed Mar 12 00:41:43 2003
2705 #include <linux/config.h>
2706 +#include <linux/compiler.h>
2709 * These have to be done with inline assembly: that way the bit-setting
2715 +static __inline__ void __clear_bit(int nr, volatile void * addr)
2717 + __asm__ __volatile__(
2722 #define smp_mb__before_clear_bit() barrier()
2723 #define smp_mb__after_clear_bit() barrier()
2725 @@ -284,6 +293,34 @@
2729 + * find_first_bit - find the first set bit in a memory region
2730 + * @addr: The address to start the search at
2731 + * @size: The maximum size to search
2733 + * Returns the bit-number of the first set bit, not the number of the byte
2734 + * containing a bit.
2736 +static __inline__ int find_first_bit(void * addr, unsigned size)
2741 + /* This looks at memory. Mark it volatile to tell gcc not to move it around */
2742 + __asm__ __volatile__(
2743 + "xorl %%eax,%%eax\n\t"
2746 + "leal -4(%%edi),%%edi\n\t"
2747 + "bsfl (%%edi),%%eax\n"
2748 + "1:\tsubl %%ebx,%%edi\n\t"
2749 + "shll $3,%%edi\n\t"
2750 + "addl %%edi,%%eax"
2751 + :"=a" (res), "=&c" (d0), "=&D" (d1)
2752 + :"1" ((size + 31) >> 5), "2" (addr), "b" (addr));
2757 * find_next_zero_bit - find the first zero bit in a memory region
2758 * @addr: The address to base the search on
2759 * @offset: The bitnumber to start searching at
2764 - * Look for zero in first byte
2765 + * Look for zero in the first 32 bits.
2767 __asm__("bsfl %1,%0\n\t"
2769 @@ -317,6 +354,39 @@
2773 + * find_next_bit - find the next set bit in a memory region
2774 + * @addr: The address to base the search on
2775 + * @offset: The bitnumber to start searching at
2776 + * @size: The maximum size to search
2778 +static __inline__ int find_next_bit (void * addr, int size, int offset)
2780 + unsigned long * p = ((unsigned long *) addr) + (offset >> 5);
2781 + int set = 0, bit = offset & 31, res;
2785 + * Look for nonzero in the first 32 bits:
2787 + __asm__("bsfl %1,%0\n\t"
2792 + : "r" (*p >> bit));
2793 + if (set < (32 - bit))
2794 + return set + offset;
2799 + * No set bit yet, search remaining full words for a bit
2801 + res = find_first_bit (p, size - 32 * (p - (unsigned long *) addr));
2802 + return (offset + set + res);
2806 * ffz - find first zero in word.
2807 * @word: The word to search
2809 @@ -330,7 +400,40 @@
2814 + * __ffs - find first bit in word.
2815 + * @word: The word to search
2817 + * Undefined if no bit exists, so code should check against 0 first.
2819 +static __inline__ unsigned long __ffs(unsigned long word)
2821 + __asm__("bsfl %1,%0"
2830 + * Every architecture must define this function. It's the fastest
2831 + * way of searching a 140-bit bitmap where the first 100 bits are
2832 + * unlikely to be set. It's guaranteed that at least one of the 140
2833 + * bits is set.
2835 +static inline int sched_find_first_bit(unsigned long *b)
2837 + if (unlikely(b[0]))
2838 + return __ffs(b[0]);
2839 + if (unlikely(b[1]))
2840 + return __ffs(b[1]) + 32;
2841 + if (unlikely(b[2]))
2842 + return __ffs(b[2]) + 64;
2843 + if (unlikely(b[3]))
2844 + return __ffs(b[3]) + 96;
2845 + return __ffs(b[4]) + 128;
2849 * ffs - find first bit set
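The unrolled cascade encodes the expectation that the first 100 bits (the
real-time priority levels) are rarely populated. In the O(1) scheduler this
is the pick-next-task primitive; a sketch of the intended use, assuming the
prio_array fields (bitmap, queue) of the runqueue described in
Documentation/sched-coding.txt:

	idx = sched_find_first_bit(array->bitmap);
	queue = array->queue + idx;
	next = list_entry(queue->next, task_t, run_list);

schedule() only reaches this after establishing that the runqueue is
non-empty, which is what guarantees the 'at least one bit is set'
precondition.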
2850 diff -urN linux-2.4.20/include/asm-i386/mmu_context.h linux-2.4.20-o1/include/asm-i386/mmu_context.h
2851 --- linux-2.4.20/include/asm-i386/mmu_context.h Sat Aug 3 02:39:45 2002
2852 +++ linux-2.4.20-o1/include/asm-i386/mmu_context.h Wed Mar 12 00:41:43 2003
2855 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu)
2857 - if (prev != next) {
2858 + if (likely(prev != next)) {
2859 /* stop flush ipis for the previous mm */
2860 clear_bit(cpu, &prev->cpu_vm_mask);
2862 * Re-load LDT if necessary
2864 - if (prev->context.segments != next->context.segments)
2865 + if (unlikely(prev->context.segments != next->context.segments))
2868 cpu_tlbstate[cpu].state = TLBSTATE_OK;
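likely()/unlikely() annotate the expected truth value of a condition so gcc
keeps the common case on the straight-line path: here the mm really changing
is the common case, and an LDT segment mismatch is the rare one. In 2.4 these
are thin wrappers over a gcc builtin, per include/linux/compiler.h:

	#define likely(x)	__builtin_expect((x),1)
	#define unlikely(x)	__builtin_expect((x),0)

(For gcc older than 2.96, __builtin_expect is defined away and the hints are
no-ops.)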
2869 diff -urN linux-2.4.20/include/asm-i386/pgalloc.h linux-2.4.20-o1/include/asm-i386/pgalloc.h
2870 --- linux-2.4.20/include/asm-i386/pgalloc.h Sat Aug 3 02:39:45 2002
2871 +++ linux-2.4.20-o1/include/asm-i386/pgalloc.h Wed Mar 12 00:41:43 2003
2874 struct mm_struct *active_mm;
2876 + char __cacheline_padding[24];
2878 extern struct tlb_state cpu_tlbstate[NR_CPUS];
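The padding sizes struct tlb_state up to a full cache line so that per-CPU
entries of cpu_tlbstate[] do not share one: each CPU rewrites its own entry
on every mm switch, and false sharing would bounce the line between CPUs.
Arithmetic on i386: a 4-byte active_mm pointer plus a 4-byte state word plus
24 bytes of padding is 32 bytes, one L1 line on P5/P6-class processors, so
consecutive array entries start on different lines (assuming the array itself
is line-aligned).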
2880 diff -urN linux-2.4.20/include/asm-i386/processor.h linux-2.4.20-o1/include/asm-i386/processor.h
2881 --- linux-2.4.20/include/asm-i386/processor.h Sat Aug 3 02:39:45 2002
2882 +++ linux-2.4.20-o1/include/asm-i386/processor.h Wed Mar 12 00:41:43 2003
2885 #define cpu_relax() rep_nop()
2887 +#define ARCH_HAS_SMP_BALANCE
2889 /* Prefetch instructions for Pentium III and AMD Athlon */
2890 #ifdef CONFIG_MPENTIUMIII
2892 diff -urN linux-2.4.20/include/asm-i386/smp.h linux-2.4.20-o1/include/asm-i386/smp.h
2893 --- linux-2.4.20/include/asm-i386/smp.h Fri Nov 29 00:53:15 2002
2894 +++ linux-2.4.20-o1/include/asm-i386/smp.h Wed Mar 12 00:41:43 2003
2896 extern void smp_flush_tlb(void);
2897 extern void smp_message_irq(int cpl, void *dev_id, struct pt_regs *regs);
2898 extern void smp_send_reschedule(int cpu);
2899 +extern void smp_send_reschedule_all(void);
2900 extern void smp_invalidate_rcv(void); /* Process an NMI */
2901 extern void (*mtrr_hook) (void);
2902 extern void zap_low_mappings (void);
2904 * so this is correct in the x86 case.
2907 -#define smp_processor_id() (current->processor)
2908 +#define smp_processor_id() (current->cpu)
2910 static __inline int hard_smp_processor_id(void)
2913 #endif /* !__ASSEMBLY__ */
2915 #define NO_PROC_ID 0xFF /* No processor magic marker */
2918 - * This magic constant controls our willingness to transfer
2919 - * a process across CPUs. Such a transfer incurs misses on the L1
2920 - * cache, and on a P6 or P5 with multiple L2 caches L2 hits. My
2921 - * gut feeling is this will vary by board in value. For a board
2922 - * with separate L2 cache it probably depends also on the RSS, and
2923 - * for a board with shared L2 cache it ought to decay fast as other
2924 - * processes are run.
2927 -#define PROC_CHANGE_PENALTY 15 /* Schedule penalty */
2931 diff -urN linux-2.4.20/include/asm-i386/smp_balance.h linux-2.4.20-o1/include/asm-i386/smp_balance.h
2932 --- linux-2.4.20/include/asm-i386/smp_balance.h Thu Jan 1 01:00:00 1970
2933 +++ linux-2.4.20-o1/include/asm-i386/smp_balance.h Wed Mar 12 00:41:43 2003
2935 +#ifndef _ASM_SMP_BALANCE_H
2936 +#define _ASM_SMP_BALANCE_H
2939 + * We have an architecture-specific SMP load balancer to improve
2940 + * scheduling behavior on hyperthreaded CPUs. Since only P4s have
2941 + * HT, maybe this should be conditional on CONFIG_MPENTIUM4...
2946 + * Find any idle processor package (i.e. both virtual processors are idle)
2948 +static inline int find_idle_package(int this_cpu)
2952 + this_cpu = cpu_number_map(this_cpu);
2954 + for (i = (this_cpu + 1) % smp_num_cpus;
2956 + i = (i + 1) % smp_num_cpus) {
2957 + int physical = cpu_logical_map(i);
2958 + int sibling = cpu_sibling_map[physical];
2960 + if (idle_cpu(physical) && idle_cpu(sibling))
2961 + return physical;
2963 + return -1; /* not found */
2966 +static inline int arch_reschedule_idle_override(task_t * p, int idle)
2968 + if (unlikely(smp_num_siblings > 1) && !idle_cpu(cpu_sibling_map[idle])) {
2969 + int true_idle = find_idle_package(idle);
2970 + if (true_idle >= 0) {
2971 + if (likely(p->cpus_allowed & (1UL << true_idle)))
2972 + return true_idle;
2974 + true_idle = cpu_sibling_map[true_idle];
2975 + if (p->cpus_allowed & (1UL << true_idle))
2976 + return true_idle;
2984 +static inline int arch_load_balance(int this_cpu, int idle)
2986 + /* Special hack for hyperthreading */
2987 + if (unlikely(smp_num_siblings > 1 && idle == 2 && !idle_cpu(cpu_sibling_map[this_cpu]))) {
2989 + struct runqueue *rq_target;
2991 + if ((found = find_idle_package(this_cpu)) >= 0 ) {
2992 + rq_target = cpu_rq(found);
2993 + resched_task(rq_target->idle);
3000 +#endif /* _ASM_SMP_BALANCE_H */
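The override covers the case where the scheduler is about to use an 'idle'
logical CPU whose hyperthread sibling is busy: running there would contend
for core resources, while a fully idle package is strictly better. Worked
trace, assuming a 2-package, 4-logical-CPU box with cpu_sibling_map pairing
0 with 1 and 2 with 3: a wakeup targets logical CPU 1 while CPU 0 is busy and
CPUs 2 and 3 are idle. find_idle_package(1) returns 2, so
arch_reschedule_idle_override() redirects the task to CPU 2 if
p->cpus_allowed permits it, otherwise tries sibling 3, otherwise keeps the
original choice.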
3001 diff -urN linux-2.4.20/include/asm-i386/system.h linux-2.4.20-o1/include/asm-i386/system.h
3002 --- linux-2.4.20/include/asm-i386/system.h Fri Nov 29 00:53:15 2002
3003 +++ linux-2.4.20-o1/include/asm-i386/system.h Wed Mar 12 00:41:43 2003
3005 struct task_struct; /* one of the stranger aspects of C forward declarations.. */
3006 extern void FASTCALL(__switch_to(struct task_struct *prev, struct task_struct *next));
3008 -#define prepare_to_switch() do { } while(0)
3009 #define switch_to(prev,next,last) do { \
3010 asm volatile("pushl %%esi\n\t" \
3013 "movl %%esp,%0\n\t" /* save ESP */ \
3014 - "movl %3,%%esp\n\t" /* restore ESP */ \
3015 + "movl %2,%%esp\n\t" /* restore ESP */ \
3016 "movl $1f,%1\n\t" /* save EIP */ \
3017 - "pushl %4\n\t" /* restore EIP */ \
3018 + "pushl %3\n\t" /* restore EIP */ \
3019 "jmp __switch_to\n" \
3024 - :"=m" (prev->thread.esp),"=m" (prev->thread.eip), \
3026 + :"=m" (prev->thread.esp),"=m" (prev->thread.eip) \
3027 :"m" (next->thread.esp),"m" (next->thread.eip), \
3028 - "a" (prev), "d" (next), \
3030 + "a" (prev), "d" (next)); \
3033 #define _set_base(addr,base) do { unsigned long __pr; \
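The operand renumbering above follows from how gcc counts inline-asm
operands: outputs first, then inputs, in declaration order. Dropping the
'last' output (the old "=b" operand) shifts every input down by one, so the
template's %3/%4 become %2/%3 and the matching "b" input disappears with it.
A toy illustration (hypothetical code, not from the kernel):

	int dst, src = 42;

	/* one output (%0) and one input (%1); adding a second output
	 * before the input would force the input to be renamed %2 */
	__asm__("movl %1, %0" : "=r" (dst) : "r" (src));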
3034 diff -urN linux-2.4.20/include/asm-ia64/bitops.h linux-2.4.20-o1/include/asm-ia64/bitops.h
3035 --- linux-2.4.20/include/asm-ia64/bitops.h Fri Nov 29 00:53:15 2002
3036 +++ linux-2.4.20-o1/include/asm-ia64/bitops.h Wed Mar 12 00:41:43 2003
3038 #define _ASM_IA64_BITOPS_H
3041 - * Copyright (C) 1998-2001 Hewlett-Packard Co
3042 - * Copyright (C) 1998-2001 David Mosberger-Tang <davidm@hpl.hp.com>
3043 + * Copyright (C) 1998-2002 Hewlett-Packard Co
3044 + * David Mosberger-Tang <davidm@hpl.hp.com>
3046 + * 02/06/02 find_next_bit() and find_first_bit() added from Erich Focht's ia64 O(1)
3050 +#include <linux/types.h>
3052 #include <asm/system.h>
3059 + * __clear_bit - Clears a bit in memory (non-atomic version)
3061 +static __inline__ void
3062 +__clear_bit (int nr, volatile void *addr)
3064 + volatile __u32 *p = (__u32 *) addr + (nr >> 5);
3065 + __u32 m = 1 << (nr & 31);
3070 * change_bit - Toggle a bit in memory
3072 * @addr: Address to start counting from
3073 @@ -264,12 +280,11 @@
3077 - * ffz - find the first zero bit in a memory region
3078 - * @x: The address to start the search at
3079 + * ffz - find the first zero bit in a long word
3080 + * @x: The long word to find the bit in
3082 - * Returns the bit-number (0..63) of the first (least significant) zero bit, not
3083 - * the number of the byte containing a bit. Undefined if no zero exists, so
3084 - * code should check against ~0UL first...
3085 + * Returns the bit-number (0..63) of the first (least significant) zero bit. Undefined if
3086 + * no zero exists, so code should check against ~0UL first...
3088 static inline unsigned long
3089 ffz (unsigned long x)
3090 @@ -280,6 +295,21 @@
3095 + * __ffs - find first bit in word.
3096 + * @x: The word to search
3098 + * Undefined if no bit exists, so code should check against 0 first.
3100 +static __inline__ unsigned long
3101 +__ffs (unsigned long x)
3103 + unsigned long result;
3105 + __asm__ ("popcnt %0=%1" : "=r" (result) : "r" ((x - 1) & ~x));
3112 @@ -296,6 +326,12 @@
3113 return exp - 0xffff;
3119 + return ia64_fls((unsigned int) x);
3123 * ffs: find first bit set. This is defined the same way as the libc and compiler builtin
3124 * ffs routines, therefore differs in spirit from the above ffz (man ffs): it operates on
3125 @@ -368,8 +404,53 @@
3127 #define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)
3130 + * Find the next set bit in a bitmap reasonably efficiently.
3133 +find_next_bit (void *addr, unsigned long size, unsigned long offset)
3135 + unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
3136 + unsigned long result = offset & ~63UL;
3137 + unsigned long tmp;
3139 + if (offset >= size)
3145 + tmp &= ~0UL << offset;
3149 + goto found_middle;
3153 + while (size & ~63UL) {
3154 + if ((tmp = *(p++)))
3155 + goto found_middle;
3163 + tmp &= ~0UL >> (64-size);
3164 + if (tmp == 0UL) /* Are any bits set? */
3165 + return result + size; /* Nope. */
3167 + return result + __ffs(tmp);
3170 +#define find_first_bit(addr, size) find_next_bit((addr), (size), 0)
3174 +#define __clear_bit(nr, addr) clear_bit(nr, addr)
3176 #define ext2_set_bit test_and_set_bit
3177 #define ext2_clear_bit test_and_clear_bit
3178 #define ext2_test_bit test_bit
3179 @@ -382,6 +463,16 @@
3180 #define minix_test_and_clear_bit(nr,addr) test_and_clear_bit(nr,addr)
3181 #define minix_test_bit(nr,addr) test_bit(nr,addr)
3182 #define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
3185 +sched_find_first_bit (unsigned long *b)
3187 + if (unlikely(b[0]))
3188 + return __ffs(b[0]);
3189 + if (unlikely(b[1]))
3190 + return 64 + __ffs(b[1]);
3191 + return __ffs(b[2]) + 128;
3194 #endif /* __KERNEL__ */
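The popcnt form of __ffs() works because (x - 1) & ~x sets exactly the bits
strictly below the lowest set bit of x, so counting them yields that bit's
index. Worked example: x = 0x28 (bits 3 and 5), x - 1 = 0x27,
(x - 1) & ~x = 0x07, popcnt(0x07) = 3 = __ffs(0x28). With 64-bit words the
140-bit priority bitmap also shrinks to three words, which is why the
sched_find_first_bit() above needs only two unlikely() tests before the final
word.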
3196 diff -urN linux-2.4.20/include/asm-m68k/bitops.h linux-2.4.20-o1/include/asm-m68k/bitops.h
3197 --- linux-2.4.20/include/asm-m68k/bitops.h Thu Oct 25 22:53:55 2001
3198 +++ linux-2.4.20-o1/include/asm-m68k/bitops.h Wed Mar 12 00:41:43 2003
3200 (__builtin_constant_p(nr) ? \
3201 __constant_clear_bit(nr, vaddr) : \
3202 __generic_clear_bit(nr, vaddr))
3203 +#define __clear_bit(nr,vaddr) clear_bit(nr,vaddr)
3205 extern __inline__ void __constant_clear_bit(int nr, volatile void * vaddr)
3207 @@ -239,6 +240,28 @@
3211 +#define __ffs(x) (ffs(x) - 1)
3215 + * Every architecture must define this function. It's the fastest
3216 + * way of searching a 140-bit bitmap where the first 100 bits are
3217 + * unlikely to be set. It's guaranteed that at least one of the 140
3218 + * bits is set.
3220 +static inline int sched_find_first_bit(unsigned long *b)
3222 + if (unlikely(b[0]))
3223 + return __ffs(b[0]);
3224 + if (unlikely(b[1]))
3225 + return __ffs(b[1]) + 32;
3226 + if (unlikely(b[2]))
3227 + return __ffs(b[2]) + 64;
3228 + if (unlikely(b[3]))
3229 + return __ffs(b[3]) + 96;
3230 + return __ffs(b[4]) + 128;
3235 * hweightN: returns the hamming weight (i.e. the number
3236 diff -urN linux-2.4.20/include/asm-mips/bitops.h linux-2.4.20-o1/include/asm-mips/bitops.h
3237 --- linux-2.4.20/include/asm-mips/bitops.h Fri Nov 29 00:53:15 2002
3238 +++ linux-2.4.20-o1/include/asm-mips/bitops.h Wed Mar 12 00:41:43 2003
3241 #ifdef CONFIG_CPU_HAS_LLSC
3243 +#include <asm/mipsregs.h>
3246 * These functions for MIPS ISA > 1 are interrupt and SMP proof and
3247 * interrupt friendly
3250 : "=r" (res), "=r" (dummy), "=r" (addr)
3251 : "0" ((signed int) 0), "1" ((unsigned int) 0xffffffff),
3252 - "2" (addr), "r" (size));
3253 + "2" (addr), "r" (size)
3261 : "=r" (set), "=r" (dummy)
3262 - : "0" (0), "1" (1 << bit), "r" (*p));
3263 + : "0" (0), "1" (1 << bit), "r" (*p)
3265 if (set < (32 - bit))
3266 return set + offset;
3268 @@ -684,20 +688,29 @@
3270 * Undefined if no zero exists, so code should check against ~0UL first.
3272 -static __inline__ unsigned long ffz(unsigned long word)
3273 +extern __inline__ unsigned long ffz(unsigned long word)
3276 + unsigned int __res;
3277 + unsigned int mask = 1;
3280 - s = 16; if (word << 16 != 0) s = 0; b += s; word >>= s;
3281 - s = 8; if (word << 24 != 0) s = 0; b += s; word >>= s;
3282 - s = 4; if (word << 28 != 0) s = 0; b += s; word >>= s;
3283 - s = 2; if (word << 30 != 0) s = 0; b += s; word >>= s;
3284 - s = 1; if (word << 31 != 0) s = 0; b += s;
3286 + ".set\tnoreorder\n\t"
3289 + "1:\tand\t$1,%2,%1\n\t"
3297 + : "=&r" (__res), "=r" (mask)
3298 + : "r" (word), "1" (mask)
3308 diff -urN linux-2.4.20/include/asm-mips64/bitops.h linux-2.4.20-o1/include/asm-mips64/bitops.h
3309 --- linux-2.4.20/include/asm-mips64/bitops.h Fri Nov 29 00:53:15 2002
3310 +++ linux-2.4.20-o1/include/asm-mips64/bitops.h Wed Mar 12 00:41:43 2003
3313 #include <asm/system.h>
3314 #include <asm/sgidefs.h>
3315 +#include <asm/mipsregs.h>
3318 * set_bit - Atomically set a bit in memory
3320 * Note that @nr may be almost arbitrarily large; this function is not
3321 * restricted to acting on a single-word quantity.
3323 -static inline void set_bit(unsigned long nr, volatile void *addr)
3324 +extern __inline__ void
3325 +set_bit(unsigned long nr, volatile void *addr)
3327 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3330 * If it's called on the same region of memory simultaneously, the effect
3331 * may be that only one operation succeeds.
3333 -static inline void __set_bit(int nr, volatile void * addr)
3334 +extern __inline__ void __set_bit(int nr, volatile void * addr)
3336 unsigned long * m = ((unsigned long *) addr) + (nr >> 6);
3339 * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
3340 * in order to ensure changes are visible on other processors.
3342 -static inline void clear_bit(unsigned long nr, volatile void *addr)
3343 +extern __inline__ void
3344 +clear_bit(unsigned long nr, volatile void *addr)
3346 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3349 * Note that @nr may be almost arbitrarily large; this function is not
3350 * restricted to acting on a single-word quantity.
3352 -static inline void change_bit(unsigned long nr, volatile void *addr)
3353 +extern __inline__ void
3354 +change_bit(unsigned long nr, volatile void *addr)
3356 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3359 * If it's called on the same region of memory simultaneously, the effect
3360 * may be that only one operation succeeds.
3362 -static inline void __change_bit(int nr, volatile void * addr)
3363 +extern __inline__ void __change_bit(int nr, volatile void * addr)
3365 unsigned long * m = ((unsigned long *) addr) + (nr >> 6);
3368 * This operation is atomic and cannot be reordered.
3369 * It also implies a memory barrier.
3371 -static inline unsigned long test_and_set_bit(unsigned long nr,
3372 - volatile void *addr)
3373 +extern __inline__ unsigned long
3374 +test_and_set_bit(unsigned long nr, volatile void *addr)
3376 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3377 unsigned long temp, res;
3379 * If two examples of this operation race, one can appear to succeed
3380 * but actually fail. You must protect multiple accesses with a lock.
3382 -static inline int __test_and_set_bit(int nr, volatile void *addr)
3383 +extern __inline__ int
3384 +__test_and_set_bit(int nr, volatile void * addr)
3386 unsigned long mask, retval;
3387 long *a = (unsigned long *) addr;
3389 * This operation is atomic and cannot be reordered.
3390 * It also implies a memory barrier.
3392 -static inline unsigned long test_and_clear_bit(unsigned long nr,
3393 - volatile void *addr)
3394 +extern __inline__ unsigned long
3395 +test_and_clear_bit(unsigned long nr, volatile void *addr)
3397 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3398 unsigned long temp, res;
3400 * If two examples of this operation race, one can appear to succeed
3401 * but actually fail. You must protect multiple accesses with a lock.
3403 -static inline int __test_and_clear_bit(int nr, volatile void * addr)
3404 +extern __inline__ int
3405 +__test_and_clear_bit(int nr, volatile void * addr)
3407 unsigned long mask, retval;
3408 unsigned long *a = (unsigned long *) addr;
3410 * This operation is atomic and cannot be reordered.
3411 * It also implies a memory barrier.
3413 -static inline unsigned long test_and_change_bit(unsigned long nr,
3414 - volatile void *addr)
3415 +extern __inline__ unsigned long
3416 +test_and_change_bit(unsigned long nr, volatile void *addr)
3418 unsigned long *m = ((unsigned long *) addr) + (nr >> 6);
3419 unsigned long temp, res;
3421 * If two examples of this operation race, one can appear to succeed
3422 * but actually fail. You must protect multiple accesses with a lock.
3424 -static inline int __test_and_change_bit(int nr, volatile void *addr)
3425 +extern __inline__ int
3426 +__test_and_change_bit(int nr, volatile void * addr)
3428 unsigned long mask, retval;
3429 unsigned long *a = (unsigned long *) addr;
3431 * @nr: bit number to test
3432 * @addr: Address to start counting from
3434 -static inline unsigned long test_bit(int nr, volatile void * addr)
3435 +extern __inline__ unsigned long
3436 +test_bit(int nr, volatile void * addr)
3438 return 1UL & (((volatile unsigned long *) addr)[nr >> 6] >> (nr & 0x3f));
3441 * Returns the bit-number of the first zero bit, not the number of the byte
3444 -static inline int find_first_zero_bit (void *addr, unsigned size)
3445 +extern __inline__ int
3446 +find_first_zero_bit (void *addr, unsigned size)
3448 unsigned long dummy;
3452 : "=r" (res), "=r" (dummy), "=r" (addr)
3453 : "0" ((signed int) 0), "1" ((unsigned int) 0xffffffff),
3454 - "2" (addr), "r" (size));
3455 + "2" (addr), "r" (size)
3461 * @offset: The bitnumber to start searching at
3462 * @size: The maximum size to search
3464 -static inline int find_next_zero_bit (void * addr, int size, int offset)
3465 +extern __inline__ int
3466 +find_next_zero_bit (void * addr, int size, int offset)
3468 unsigned int *p = ((unsigned int *) addr) + (offset >> 5);
3469 int set = 0, bit = offset & 31, res;
3473 : "=r" (set), "=r" (dummy)
3474 - : "0" (0), "1" (1 << bit), "r" (*p));
3475 + : "0" (0), "1" (1 << bit), "r" (*p)
3477 if (set < (32 - bit))
3478 return set + offset;
3480 @@ -400,19 +412,20 @@
3482 * Undefined if no zero exists, so code should check against ~0UL first.
3484 -static __inline__ unsigned long ffz(unsigned long word)
3485 +extern __inline__ unsigned long ffz(unsigned long word)
3491 - s = 32; if (word << 32 != 0) s = 0; b += s; word >>= s;
3492 - s = 16; if (word << 48 != 0) s = 0; b += s; word >>= s;
3493 - s = 8; if (word << 56 != 0) s = 0; b += s; word >>= s;
3494 - s = 4; if (word << 60 != 0) s = 0; b += s; word >>= s;
3495 - s = 2; if (word << 62 != 0) s = 0; b += s; word >>= s;
3496 - s = 1; if (word << 63 != 0) s = 0; b += s;
3498 + if (word & 0x00000000ffffffffUL) { k -= 32; word <<= 32; }
3499 + if (word & 0x0000ffff00000000UL) { k -= 16; word <<= 16; }
3500 + if (word & 0x00ff000000000000UL) { k -= 8; word <<= 8; }
3501 + if (word & 0x0f00000000000000UL) { k -= 4; word <<= 4; }
3502 + if (word & 0x3000000000000000UL) { k -= 2; word <<= 2; }
3503 + if (word & 0x4000000000000000UL) { k -= 1; }
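The rewritten ffz() is a binary search on the inverted word (ffz(x) is
__ffs(~x)): k starts at 63, and every test that finds the interesting bit in
the lower half subtracts that half's width from k and shifts the half upward
for the next round. Worked trace for a word whose only clear bit is bit 5:
the tests narrow k as 63 -> 31 -> 15 -> 7 -> 7 -> 5, and 5 is returned.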
3511 * @offset: The bitnumber to start searching at
3512 * @size: The maximum size to search
3514 -static inline unsigned long find_next_zero_bit(void *addr, unsigned long size,
3515 - unsigned long offset)
3516 +extern __inline__ unsigned long
3517 +find_next_zero_bit(void *addr, unsigned long size, unsigned long offset)
3519 unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
3520 unsigned long result = offset & ~63UL;
3525 -static inline int ext2_set_bit(int nr,void * addr)
3527 +ext2_set_bit(int nr,void * addr)
3529 int mask, retval, flags;
3530 unsigned char *ADDR = (unsigned char *) addr;
3535 -static inline int ext2_clear_bit(int nr, void * addr)
3537 +ext2_clear_bit(int nr, void * addr)
3539 int mask, retval, flags;
3540 unsigned char *ADDR = (unsigned char *) addr;
3545 -static inline int ext2_test_bit(int nr, const void * addr)
3547 +ext2_test_bit(int nr, const void * addr)
3550 const unsigned char *ADDR = (const unsigned char *) addr;
3552 #define ext2_find_first_zero_bit(addr, size) \
3553 ext2_find_next_zero_bit((addr), (size), 0)
3555 -static inline unsigned int ext2_find_next_zero_bit(void *addr,
3556 - unsigned long size,
3557 - unsigned long offset)
3558 +extern inline unsigned int
3559 +ext2_find_next_zero_bit(void *addr, unsigned long size, unsigned long offset)
3561 unsigned int *p = ((unsigned int *) addr) + (offset >> 5);
3562 unsigned int result = offset & ~31UL;
3563 diff -urN linux-2.4.20/include/asm-ppc/bitops.h linux-2.4.20-o1/include/asm-ppc/bitops.h
3564 --- linux-2.4.20/include/asm-ppc/bitops.h Tue Jun 12 04:15:27 2001
3565 +++ linux-2.4.20-o1/include/asm-ppc/bitops.h Wed Mar 12 00:41:43 2003
3568 - * BK Id: SCCS/s.bitops.h 1.9 05/26/01 14:48:14 paulus
3569 + * BK Id: %F% %I% %G% %U% %#%
3572 * bitops.h: Bit string operations on the ppc
3574 #define _PPC_BITOPS_H
3576 #include <linux/config.h>
3577 +#include <linux/compiler.h>
3578 #include <asm/byteorder.h>
3579 +#include <asm/atomic.h>
3581 +/* #ifdef CONFIG_IBM405_ERR77 */
3582 +#ifdef CONFIG_WALNUT
3583 +#define PPC405_ERR77(ra,rb) dcbt ra, rb;
3584 +#define PPC405_ERR77_SYNC sync;
3586 +#define PPC405_ERR77(ra,rb)
3587 +#define PPC405_ERR77_SYNC
3588 +#endif
3591 * The test_and_*_bit operations are taken to imply a memory barrier
3593 * These used to be if'd out here because using : "cc" as a constraint
3594 * resulted in errors from egcs. Things appear to be OK with gcc-2.95.
3596 -static __inline__ void set_bit(int nr, volatile void * addr)
3597 +static __inline__ void set_bit(int nr, volatile unsigned long * addr)
3600 unsigned long mask = 1 << (nr & 0x1f);
3603 __asm__ __volatile__("\n\
3604 1: lwarx %0,0,%3 \n\
3606 - stwcx. %0,0,%3 \n\
3608 + PPC405_ERR77(0,%3)
3609 +" stwcx. %0,0,%3 \n\
3611 : "=&r" (old), "=m" (*p)
3612 : "r" (mask), "r" (p), "m" (*p)
3615 * non-atomic version
3617 -static __inline__ void __set_bit(int nr, volatile void *addr)
3618 +static __inline__ void __set_bit(int nr, volatile unsigned long *addr)
3620 unsigned long mask = 1 << (nr & 0x1f);
3621 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3623 #define smp_mb__before_clear_bit() smp_mb()
3624 #define smp_mb__after_clear_bit() smp_mb()
3626 -static __inline__ void clear_bit(int nr, volatile void *addr)
3627 +static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
3630 unsigned long mask = 1 << (nr & 0x1f);
3633 __asm__ __volatile__("\n\
3634 1: lwarx %0,0,%3 \n\
3636 - stwcx. %0,0,%3 \n\
3638 + PPC405_ERR77(0,%3)
3639 +" stwcx. %0,0,%3 \n\
3641 : "=&r" (old), "=m" (*p)
3642 : "r" (mask), "r" (p), "m" (*p)
3645 * non-atomic version
3647 -static __inline__ void __clear_bit(int nr, volatile void *addr)
3648 +static __inline__ void __clear_bit(int nr, volatile unsigned long *addr)
3650 unsigned long mask = 1 << (nr & 0x1f);
3651 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3656 -static __inline__ void change_bit(int nr, volatile void *addr)
3657 +static __inline__ void change_bit(int nr, volatile unsigned long *addr)
3660 unsigned long mask = 1 << (nr & 0x1f);
3663 __asm__ __volatile__("\n\
3664 1: lwarx %0,0,%3 \n\
3666 - stwcx. %0,0,%3 \n\
3668 + PPC405_ERR77(0,%3)
3669 +" stwcx. %0,0,%3 \n\
3671 : "=&r" (old), "=m" (*p)
3672 : "r" (mask), "r" (p), "m" (*p)
3675 * non-atomic version
3677 -static __inline__ void __change_bit(int nr, volatile void *addr)
3678 +static __inline__ void __change_bit(int nr, volatile unsigned long *addr)
3680 unsigned long mask = 1 << (nr & 0x1f);
3681 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3684 * test_and_*_bit do imply a memory barrier (?)
3686 -static __inline__ int test_and_set_bit(int nr, volatile void *addr)
3687 +static __inline__ int test_and_set_bit(int nr, volatile unsigned long *addr)
3689 unsigned int old, t;
3690 unsigned int mask = 1 << (nr & 0x1f);
3693 __asm__ __volatile__(SMP_WMB "\n\
3694 1: lwarx %0,0,%4 \n\
3696 - stwcx. %1,0,%4 \n\
3698 + PPC405_ERR77(0,%4)
3699 +" stwcx. %1,0,%4 \n\
3702 : "=&r" (old), "=&r" (t), "=m" (*p)
3705 * non-atomic version
3707 -static __inline__ int __test_and_set_bit(int nr, volatile void *addr)
3708 +static __inline__ int __test_and_set_bit(int nr, volatile unsigned long *addr)
3710 unsigned long mask = 1 << (nr & 0x1f);
3711 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3713 return (old & mask) != 0;
3716 -static __inline__ int test_and_clear_bit(int nr, volatile void *addr)
3717 +static __inline__ int test_and_clear_bit(int nr, volatile unsigned long *addr)
3719 unsigned int old, t;
3720 unsigned int mask = 1 << (nr & 0x1f);
3723 __asm__ __volatile__(SMP_WMB "\n\
3724 1: lwarx %0,0,%4 \n\
3726 - stwcx. %1,0,%4 \n\
3728 + PPC405_ERR77(0,%4)
3729 +" stwcx. %1,0,%4 \n\
3732 : "=&r" (old), "=&r" (t), "=m" (*p)
3735 * non-atomic version
3737 -static __inline__ int __test_and_clear_bit(int nr, volatile void *addr)
3738 +static __inline__ int __test_and_clear_bit(int nr, volatile unsigned long *addr)
3740 unsigned long mask = 1 << (nr & 0x1f);
3741 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3743 return (old & mask) != 0;
3746 -static __inline__ int test_and_change_bit(int nr, volatile void *addr)
3747 +static __inline__ int test_and_change_bit(int nr, volatile unsigned long *addr)
3749 unsigned int old, t;
3750 unsigned int mask = 1 << (nr & 0x1f);
3753 __asm__ __volatile__(SMP_WMB "\n\
3754 1: lwarx %0,0,%4 \n\
3756 - stwcx. %1,0,%4 \n\
3758 + PPC405_ERR77(0,%4)
3759 +" stwcx. %1,0,%4 \n\
3762 : "=&r" (old), "=&r" (t), "=m" (*p)
3765 * non-atomic version
3767 -static __inline__ int __test_and_change_bit(int nr, volatile void *addr)
3768 +static __inline__ int __test_and_change_bit(int nr, volatile unsigned long *addr)
3770 unsigned long mask = 1 << (nr & 0x1f);
3771 unsigned long *p = ((unsigned long *)addr) + (nr >> 5);
3773 return (old & mask) != 0;
3776 -static __inline__ int test_bit(int nr, __const__ volatile void *addr)
3777 +static __inline__ int test_bit(int nr, __const__ volatile unsigned long *addr)
3779 __const__ unsigned int *p = (__const__ unsigned int *) addr;
3784 /* Return the bit position of the most significant 1 bit in a word */
3785 -static __inline__ int __ilog2(unsigned int x)
3786 +static __inline__ int __ilog2(unsigned long x)
3794 -static __inline__ int ffz(unsigned int x)
3795 +static __inline__ int ffz(unsigned long x)
3799 @@ -239,6 +247,11 @@
3803 +static inline int __ffs(unsigned long x)
3805 + return __ilog2(x & -x);
3809 * ffs: find first bit set. This is defined the same way as
3810 * the libc and compiler builtin ffs routines, therefore
3811 @@ -250,6 +263,18 @@
3815 + * fls: find last (most-significant) bit set.
3816 + * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
3818 +static __inline__ int fls(unsigned int x)
3822 + asm ("cntlzw %0,%1" : "=r" (lz) : "r" (x));
3827 * hweightN: returns the hamming weight (i.e. the number
3828 * of bits set) of a N-bit word
3830 @@ -261,13 +286,86 @@
3831 #endif /* __KERNEL__ */
3834 + * Find the first bit set in a 140-bit bitmap.
3835 + * The first 100 bits are unlikely to be set.
3837 +static inline int sched_find_first_bit(unsigned long *b)
3839 + if (unlikely(b[0]))
3840 + return __ffs(b[0]);
3841 + if (unlikely(b[1]))
3842 + return __ffs(b[1]) + 32;
3843 + if (unlikely(b[2]))
3844 + return __ffs(b[2]) + 64;
3845 + if (unlikely(b[3]))
3846 + return __ffs(b[3]) + 96;
3847 + return __ffs(b[4]) + 128;
3851 + * find_next_bit - find the next set bit in a memory region
3852 + * @addr: The address to base the search on
3853 + * @offset: The bitnumber to start searching at
3854 + * @size: The maximum size to search
3856 +static __inline__ unsigned long find_next_bit(unsigned long *addr,
3857 + unsigned long size, unsigned long offset)
3859 + unsigned int *p = ((unsigned int *) addr) + (offset >> 5);
3860 + unsigned int result = offset & ~31UL;
3863 + if (offset >= size)
3869 + tmp &= ~0UL << offset;
3873 + goto found_middle;
3877 + while (size >= 32) {
3878 + if ((tmp = *p++) != 0)
3879 + goto found_middle;
3888 + tmp &= ~0UL >> (32 - size);
3889 + if (tmp == 0UL) /* Are any bits set? */
3890 + return result + size; /* Nope. */
3892 + return result + __ffs(tmp);
3896 + * find_first_bit - find the first set bit in a memory region
3897 + * @addr: The address to start the search at
3898 + * @size: The maximum size to search
3900 + * Returns the bit-number of the first set bit, not the number of the byte
3901 + * containing a bit.
3903 +#define find_first_bit(addr, size) \
3904 + find_next_bit((addr), (size), 0)
3907 * This implementation of find_{first,next}_zero_bit was stolen from
3908 * Linus' asm-alpha/bitops.h.
3910 #define find_first_zero_bit(addr, size) \
3911 find_next_zero_bit((addr), (size), 0)
3913 -static __inline__ unsigned long find_next_zero_bit(void * addr,
3914 +static __inline__ unsigned long find_next_zero_bit(unsigned long * addr,
3915 unsigned long size, unsigned long offset)
3917 unsigned int * p = ((unsigned int *) addr) + (offset >> 5);
3922 -#define ext2_set_bit(nr, addr) __test_and_set_bit((nr) ^ 0x18, addr)
3923 -#define ext2_clear_bit(nr, addr) __test_and_clear_bit((nr) ^ 0x18, addr)
3924 +#define ext2_set_bit(nr, addr) __test_and_set_bit((nr) ^ 0x18, (unsigned long *)(addr))
3925 +#define ext2_clear_bit(nr, addr) __test_and_clear_bit((nr) ^ 0x18, (unsigned long *)(addr))
3927 static __inline__ int ext2_test_bit(int nr, __const__ void * addr)
3929 diff -urN linux-2.4.20/include/asm-ppc/smp.h linux-2.4.20-o1/include/asm-ppc/smp.h
3930 --- linux-2.4.20/include/asm-ppc/smp.h Sat Aug 3 02:39:45 2002
3931 +++ linux-2.4.20-o1/include/asm-ppc/smp.h Wed Mar 12 14:29:05 2003
3933 #define cpu_logical_map(cpu) (cpu)
3934 #define cpu_number_map(x) (x)
3936 -#define smp_processor_id() (current->processor)
3937 +#define smp_processor_id() (current->cpu)
3939 extern int smp_hw_index[NR_CPUS];
3940 #define hard_smp_processor_id() (smp_hw_index[smp_processor_id()])
3941 diff -urN linux-2.4.20/include/asm-ppc64/bitops.h linux-2.4.20-o1/include/asm-ppc64/bitops.h
3942 --- linux-2.4.20/include/asm-ppc64/bitops.h Sat Aug 3 02:39:45 2002
3943 +++ linux-2.4.20-o1/include/asm-ppc64/bitops.h Wed Mar 12 00:41:43 2003
3948 -#include <asm/byteorder.h>
3949 #include <asm/memory.h>
3953 #define smp_mb__before_clear_bit() smp_mb()
3954 #define smp_mb__after_clear_bit() smp_mb()
3956 -static __inline__ int test_bit(unsigned long nr, __const__ volatile void *addr)
3957 +static __inline__ int test_bit(unsigned long nr, __const__ volatile unsigned long *addr)
3959 return (1UL & (((__const__ long *) addr)[nr >> 6] >> (nr & 63)));
3962 -static __inline__ void set_bit(unsigned long nr, volatile void *addr)
3963 +static __inline__ void set_bit(unsigned long nr, volatile unsigned long *addr)
3966 unsigned long mask = 1UL << (nr & 0x3f);
3971 -static __inline__ void clear_bit(unsigned long nr, volatile void *addr)
3972 +static __inline__ void clear_bit(unsigned long nr, volatile unsigned long *addr)
3975 unsigned long mask = 1UL << (nr & 0x3f);
3980 -static __inline__ void change_bit(unsigned long nr, volatile void *addr)
3981 +static __inline__ void change_bit(unsigned long nr, volatile unsigned long *addr)
3984 unsigned long mask = 1UL << (nr & 0x3f);
3989 -static __inline__ int test_and_set_bit(unsigned long nr, volatile void *addr)
3990 +static __inline__ int test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
3992 unsigned long old, t;
3993 unsigned long mask = 1UL << (nr & 0x3f);
3995 return (old & mask) != 0;
3998 -static __inline__ int test_and_clear_bit(unsigned long nr, volatile void *addr)
3999 +static __inline__ int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
4001 unsigned long old, t;
4002 unsigned long mask = 1UL << (nr & 0x3f);
4004 return (old & mask) != 0;
4007 -static __inline__ int test_and_change_bit(unsigned long nr, volatile void *addr)
4008 +static __inline__ int test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
4010 unsigned long old, t;
4011 unsigned long mask = 1UL << (nr & 0x3f);
4014 * non-atomic versions
4016 -static __inline__ void __set_bit(unsigned long nr, volatile void *addr)
4017 +static __inline__ void __set_bit(unsigned long nr, volatile unsigned long *addr)
4019 unsigned long mask = 1UL << (nr & 0x3f);
4020 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4025 -static __inline__ void __clear_bit(unsigned long nr, volatile void *addr)
4026 +static __inline__ void __clear_bit(unsigned long nr, volatile unsigned long *addr)
4028 unsigned long mask = 1UL << (nr & 0x3f);
4029 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4034 -static __inline__ void __change_bit(unsigned long nr, volatile void *addr)
4035 +static __inline__ void __change_bit(unsigned long nr, volatile unsigned long *addr)
4037 unsigned long mask = 1UL << (nr & 0x3f);
4038 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4043 -static __inline__ int __test_and_set_bit(unsigned long nr, volatile void *addr)
4044 +static __inline__ int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
4046 unsigned long mask = 1UL << (nr & 0x3f);
4047 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4049 return (old & mask) != 0;
4052 -static __inline__ int __test_and_clear_bit(unsigned long nr, volatile void *addr)
4053 +static __inline__ int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
4055 unsigned long mask = 1UL << (nr & 0x3f);
4056 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4058 return (old & mask) != 0;
4061 -static __inline__ int __test_and_change_bit(unsigned long nr, volatile void *addr)
4062 +static __inline__ int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
4064 unsigned long mask = 1UL << (nr & 0x3f);
4065 unsigned long *p = ((unsigned long *)addr) + (nr >> 6);
4066 @@ -224,54 +223,29 @@
4070 -/* Return the zero-based bit position
4071 - * from RIGHT TO LEFT 63 --> 0
4072 - * of the most significant (left-most) 1-bit in an 8-byte area.
4074 -static __inline__ long cnt_trailing_zeros(unsigned long mask)
4079 -" addi %0,%1,-1 \n\
4091 - * ffz = Find First Zero in word. Undefined if no zero exists,
4092 - * Determines the bit position of the LEAST significant
4093 - * (rightmost) 0 bit in the specified DOUBLE-WORD.
4094 - * The returned bit position will be zero-based, starting
4095 - * from the right side (63 - 0).
4096 - * the code should check against ~0UL first..
4097 + * Determines the bit position of the least significant (rightmost) 0 bit
4098 + * in the specified double word. The returned bit position will be zero-based,
4099 + * starting from the right side (63 - 0).
4101 static __inline__ unsigned long ffz(unsigned long x)
4105 - /* Change all of x's 1s to 0s and 0s to 1s in x.
4106 - * And insure at least 1 zero exists in the 8 byte area.
4108 + /* no zero exists anywhere in the 8 byte area. */
4110 - /* no zero exists anywhere in the 8 byte area. */
4113 - /* Calculate the bit position of the least significant '1' bit in x
4114 - * (since x has been changed this will actually be the least
4115 - * significant '0' bit in the original x).
4116 - * Note: (x & -x) gives us a mask that is the LEAST significant
4117 - * (RIGHT-most) 1-bit of the value in x.
4119 + * Calculate the bit position of the least significant '1' bit in x
4120 + * (since x has been changed this will actually be the least significant
4121 + * '0' bit in the original x). Note: (x & -x) gives us a mask that
4122 + * is the least significant (RIGHT-most) 1-bit of the value in x.
4124 - tempRC = __ilog2(x & -x);
4125 + return __ilog2(x & -x);
4129 +static __inline__ int __ffs(unsigned long x)
4131 + return __ilog2(x & -x);
4137 static __inline__ int ffs(int x)
4139 - int result = ffz(~x);
4140 - return x ? result+1 : 0;
4141 + unsigned long i = (unsigned long)x;
4142 + return __ilog2(i & -i) + 1;
4146 @@ -293,139 +267,82 @@
4147 #define hweight16(x) generic_hweight16(x)
4148 #define hweight8(x) generic_hweight8(x)
4150 -extern unsigned long find_next_zero_bit(void * addr, unsigned long size,
4151 - unsigned long offset);
4153 - * The optimizer actually does good code for this case..
4155 -#define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)
4156 +extern unsigned long find_next_zero_bit(unsigned long *addr, unsigned long size, unsigned long offset);
4157 +#define find_first_zero_bit(addr, size) \
4158 + find_next_zero_bit((addr), (size), 0)
4160 +extern unsigned long find_next_bit(unsigned long *addr, unsigned long size, unsigned long offset);
4161 +#define find_first_bit(addr, size) \
4162 + find_next_bit((addr), (size), 0)
4164 +extern unsigned long find_next_zero_le_bit(unsigned long *addr, unsigned long size, unsigned long offset);
4165 +#define find_first_zero_le_bit(addr, size) \
4166 + find_next_zero_le_bit((addr), (size), 0)
4168 -/* Bitmap functions for the ext2 filesystem. */
4169 -#define _EXT2_HAVE_ASM_BITOPS_
4171 -static __inline__ int ext2_set_bit(int nr, void* addr)
4172 +static __inline__ int test_le_bit(unsigned long nr, __const__ unsigned long * addr)
4174 - /* This method needs to take into account the fact that the ext2 file system represents
4175 - * it's bitmaps as "little endian" unsigned integers.
4176 - * Note: this method is not atomic, but ext2 does not need it to be.
4180 - unsigned char* ADDR = (unsigned char*) addr;
4182 - /* Determine the BYTE containing the specified bit
4183 - * (nr) - important as if we go to a byte there are no
4184 - * little endian concerns.
4187 - mask = 1 << (nr & 0x07); /* Create a mask to the bit within this byte. */
4188 - oldbit = *ADDR & mask; /* Save the bit's previous value. */
4189 - *ADDR |= mask; /* Turn the bit on. */
4190 - return oldbit; /* Return the bit's previous value. */
4191 + __const__ unsigned char *ADDR = (__const__ unsigned char *) addr;
4192 + return (ADDR[nr >> 3] >> (nr & 7)) & 1;
4195 -static __inline__ int ext2_clear_bit(int nr, void* addr)
4197 + * non-atomic versions
4199 +static __inline__ void __set_le_bit(unsigned long nr, unsigned long *addr)
4201 - /* This method needs to take into account the fact that the ext2 file system represents
4202 - * | it's bitmaps as "little endian" unsigned integers.
4203 - * Note: this method is not atomic, but ext2 does not need it to be.
4207 - unsigned char* ADDR = (unsigned char*) addr;
4209 - /* Determine the BYTE containing the specified bit (nr)
4210 - * - important as if we go to a byte there are no little endian concerns.
4213 - mask = 1 << (nr & 0x07); /* Create a mask to the bit within this byte. */
4214 - oldbit = *ADDR & mask; /* Save the bit's previous value. */
4215 - *ADDR = *ADDR & ~mask; /* Turn the bit off. */
4216 - return oldbit; /* Return the bit's previous value. */
4218 + unsigned char *ADDR = (unsigned char *)addr;
4220 -static __inline__ int ext2_test_bit(int nr, __const__ void * addr)
4222 - /* This method needs to take into account the fact that the ext2 file system represents
4223 - * | it's bitmaps as "little endian" unsigned integers.
4224 - * Determine the BYTE containing the specified bit (nr),
4225 - * then shift to the right the correct number of bits and return that bit's value.
4227 - __const__ unsigned char *ADDR = (__const__ unsigned char *) addr;
4228 - return (ADDR[nr >> 3] >> (nr & 7)) & 1;
4230 + *ADDR |= 1 << (nr & 0x07);
4233 -/* Returns the bit position of the most significant 1 bit in a WORD. */
4234 -static __inline__ int ext2_ilog2(unsigned int x)
4235 +static __inline__ void __clear_le_bit(unsigned long nr, unsigned long *addr)
4238 + unsigned char *ADDR = (unsigned char *)addr;
4240 - asm ("cntlzw %0,%1" : "=r" (lz) : "r" (x));
4243 + *ADDR &= ~(1 << (nr & 0x07));
4246 -/* ext2_ffz = ext2's Find First Zero.
4247 - * Determines the bit position of the LEAST significant (rightmost) 0 bit in the specified WORD.
4248 - * The returned bit position will be zero-based, starting from the right side (31 - 0).
4250 -static __inline__ int ext2_ffz(unsigned int x)
4251 +static __inline__ int __test_and_set_le_bit(unsigned long nr, unsigned long *addr)
4254 - /* Change all of x's 1s to 0s and 0s to 1s in x. And insure at least 1 zero exists in the word. */
4255 - if ((x = ~x) == 0)
4256 - /* no zero exists anywhere in the 4 byte area. */
4258 - /* Calculate the bit position of the least significant '1' bit in x
4259 - * (since x has been changed this will actually be the least
4260 - * significant '0' bit in the original x).
4261 - * Note: (x & -x) gives us a mask that is the LEAST significant
4262 - * (RIGHT-most) 1-bit of the value in x.
4264 - tempRC = ext2_ilog2(x & -x);
4267 + unsigned char *ADDR = (unsigned char *)addr;
4270 + mask = 1 << (nr & 0x07);
4271 + retval = (mask & *ADDR) != 0;
4276 -static __inline__ u32 ext2_find_next_zero_bit(void* addr, u32 size, u32 offset)
4277 +static __inline__ int __test_and_clear_le_bit(unsigned long nr, unsigned long *addr)
4279 - /* This method needs to take into account the fact that the ext2 file system represents
4280 - * | it's bitmaps as "little endian" unsigned integers.
4282 - unsigned int *p = ((unsigned int *) addr) + (offset >> 5);
4283 - unsigned int result = offset & ~31;
4286 - if (offset >= size)
4291 - tmp = cpu_to_le32p(p++);
4292 - tmp |= ~0U >> (32-offset); /* bug or feature ? */
4296 - goto found_middle;
4300 - while (size >= 32) {
4301 - if ((tmp = cpu_to_le32p(p++)) != ~0)
4302 - goto found_middle;
4308 - tmp = cpu_to_le32p(p);
4310 - tmp |= ~0 << size;
4311 - if (tmp == ~0) /* Are any bits zero? */
4312 - return result + size; /* Nope. */
4314 - return result + ext2_ffz(tmp);
4317 + unsigned char *ADDR = (unsigned char *)addr;
4319 -#define ext2_find_first_zero_bit(addr, size) ext2_find_next_zero_bit((addr), (size), 0)
4321 + mask = 1 << (nr & 0x07);
4322 + retval = (mask & *ADDR) != 0;
4327 +#define ext2_set_bit(nr,addr) \
4328 + __test_and_set_le_bit((nr),(unsigned long*)addr)
4329 +#define ext2_clear_bit(nr, addr) \
4330 + __test_and_clear_le_bit((nr),(unsigned long*)addr)
4331 +#define ext2_test_bit(nr, addr) test_le_bit((nr),(unsigned long*)addr)
4332 +#define ext2_find_first_zero_bit(addr, size) \
4333 + find_first_zero_le_bit((unsigned long*)addr, size)
4334 +#define ext2_find_next_zero_bit(addr, size, off) \
4335 + find_next_zero_le_bit((unsigned long*)addr, size, off)
4337 +#define minix_test_and_set_bit(nr,addr) test_and_set_bit(nr,addr)
4338 +#define minix_set_bit(nr,addr) set_bit(nr,addr)
4339 +#define minix_test_and_clear_bit(nr,addr) test_and_clear_bit(nr,addr)
4340 +#define minix_test_bit(nr,addr) test_bit(nr,addr)
4341 +#define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
4343 #endif /* __KERNEL__ */
4344 #endif /* _PPC64_BITOPS_H */
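The rewritten ext2 macros above route through the byte-wise *_le_bit()
helpers, which are endian-neutral: ext2 defines bit nr to live in byte
nr/8 at position nr%8, so plain unsigned char accesses produce the
little-endian layout even on big-endian ppc64.  An illustrative,
non-atomic usage sketch (ext2_alloc_sketch is a hypothetical helper; ext2
serializes bitmap updates with its own locking, which is why these
helpers need not be atomic):

	static unsigned long ext2_alloc_sketch(void *bitmap,
					       unsigned long nbits,
					       unsigned long goal)
	{
		unsigned long bit;

		bit = ext2_find_next_zero_bit(bitmap, nbits, goal);
		if (bit < nbits)
			ext2_set_bit(bit, bitmap);  /* old value, i.e. 0 */
		return bit;			    /* == nbits if none free */
	}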
4345 diff -urN linux-2.4.20/include/asm-s390/bitops.h linux-2.4.20-o1/include/asm-s390/bitops.h
4346 --- linux-2.4.20/include/asm-s390/bitops.h Sat Aug 3 02:39:45 2002
4347 +++ linux-2.4.20-o1/include/asm-s390/bitops.h Wed Mar 12 00:41:43 2003
4348 @@ -47,272 +47,217 @@
4349 extern const char _oi_bitmap[];
4350 extern const char _ni_bitmap[];
4351 extern const char _zb_findmap[];
4352 +extern const char _sb_findmap[];
4356 * SMP save set_bit routine based on compare and swap (CS)
4358 -static __inline__ void set_bit_cs(int nr, volatile void * addr)
4359 +static inline void set_bit_cs(int nr, volatile void *ptr)
4361 - unsigned long bits, mask;
4362 - __asm__ __volatile__(
4363 + unsigned long addr, old, new, mask;
4365 + addr = (unsigned long) ptr;
4367 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4368 - " nr %2,%1\n" /* isolate last 2 bits of address */
4369 - " xr %1,%2\n" /* make addr % 4 == 0 */
4371 - " ar %0,%2\n" /* add alignement to bitnr */
4372 + addr ^= addr & 3; /* align address to 4 */
4373 + nr += (addr & 3) << 3; /* add alignment to bit number */
4376 - " nr %2,%0\n" /* make shift value */
4380 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4381 - " sll %3,0(%2)\n" /* make OR mask */
4383 - "0: lr %2,%0\n" /* CS loop starts here */
4384 - " or %2,%3\n" /* set bit */
4385 - " cs %0,%2,0(%1)\n"
4387 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
4388 - : "cc", "memory" );
4389 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4390 + mask = 1UL << (nr & 31); /* make OR mask */
4395 + " cs %0,%1,0(%4)\n"
4397 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4398 + : "d" (mask), "a" (addr)
4403 * SMP save clear_bit routine based on compare and swap (CS)
4405 -static __inline__ void clear_bit_cs(int nr, volatile void * addr)
4406 +static inline void clear_bit_cs(int nr, volatile void *ptr)
4408 - static const int minusone = -1;
4409 - unsigned long bits, mask;
4410 - __asm__ __volatile__(
4411 + unsigned long addr, old, new, mask;
4413 + addr = (unsigned long) ptr;
4415 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4416 - " nr %2,%1\n" /* isolate last 2 bits of address */
4417 - " xr %1,%2\n" /* make addr % 4 == 0 */
4419 - " ar %0,%2\n" /* add alignement to bitnr */
4420 + addr ^= addr & 3; /* align address to 4 */
4421 + nr += (addr & 3) << 3; /* add alignment to bit number */
4424 - " nr %2,%0\n" /* make shift value */
4428 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4430 - " x %3,%4\n" /* make AND mask */
4432 - "0: lr %2,%0\n" /* CS loop starts here */
4433 - " nr %2,%3\n" /* clear bit */
4434 - " cs %0,%2,0(%1)\n"
4436 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask)
4437 - : "m" (minusone) : "cc", "memory" );
4438 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4439 + mask = ~(1UL << (nr & 31)); /* make AND mask */
4444 + " cs %0,%1,0(%4)\n"
4446 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4447 + : "d" (mask), "a" (addr)
4452 * SMP save change_bit routine based on compare and swap (CS)
4454 -static __inline__ void change_bit_cs(int nr, volatile void * addr)
4455 +static inline void change_bit_cs(int nr, volatile void *ptr)
4457 - unsigned long bits, mask;
4458 - __asm__ __volatile__(
4459 + unsigned long addr, old, new, mask;
4461 + addr = (unsigned long) ptr;
4463 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4464 - " nr %2,%1\n" /* isolate last 2 bits of address */
4465 - " xr %1,%2\n" /* make addr % 4 == 0 */
4467 - " ar %0,%2\n" /* add alignement to bitnr */
4468 + addr ^= addr & 3; /* align address to 4 */
4469 + nr += (addr & 3) << 3; /* add alignment to bit number */
4472 - " nr %2,%0\n" /* make shift value */
4476 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4477 - " sll %3,0(%2)\n" /* make XR mask */
4479 - "0: lr %2,%0\n" /* CS loop starts here */
4480 - " xr %2,%3\n" /* change bit */
4481 - " cs %0,%2,0(%1)\n"
4483 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
4484 - : "cc", "memory" );
4485 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4486 + mask = 1UL << (nr & 31); /* make XOR mask */
4491 + " cs %0,%1,0(%4)\n"
4493 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4494 + : "d" (mask), "a" (addr)
4499 * SMP save test_and_set_bit routine based on compare and swap (CS)
4501 -static __inline__ int test_and_set_bit_cs(int nr, volatile void * addr)
4502 +static inline int test_and_set_bit_cs(int nr, volatile void *ptr)
4504 - unsigned long bits, mask;
4505 - __asm__ __volatile__(
4506 + unsigned long addr, old, new, mask;
4508 + addr = (unsigned long) ptr;
4510 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4511 - " nr %2,%1\n" /* isolate last 2 bits of address */
4512 - " xr %1,%2\n" /* make addr % 4 == 0 */
4514 - " ar %0,%2\n" /* add alignement to bitnr */
4515 + addr ^= addr & 3; /* align address to 4 */
4516 + nr += (addr & 3) << 3; /* add alignment to bit number */
4519 - " nr %2,%0\n" /* make shift value */
4523 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4524 - " sll %3,0(%2)\n" /* make OR mask */
4526 - "0: lr %2,%0\n" /* CS loop starts here */
4527 - " or %2,%3\n" /* set bit */
4528 - " cs %0,%2,0(%1)\n"
4530 - " nr %0,%3\n" /* isolate old bit */
4531 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
4532 - : "cc", "memory" );
4534 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4535 + mask = 1UL << (nr & 31); /* make OR/test mask */
4540 + " cs %0,%1,0(%4)\n"
4542 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4543 + : "d" (mask), "a" (addr)
4545 + return (old & mask) != 0;
4549 * SMP save test_and_clear_bit routine based on compare and swap (CS)
4551 -static __inline__ int test_and_clear_bit_cs(int nr, volatile void * addr)
4552 +static inline int test_and_clear_bit_cs(int nr, volatile void *ptr)
4554 - static const int minusone = -1;
4555 - unsigned long bits, mask;
4556 - __asm__ __volatile__(
4557 + unsigned long addr, old, new, mask;
4559 + addr = (unsigned long) ptr;
4561 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4562 - " nr %2,%1\n" /* isolate last 2 bits of address */
4563 - " xr %1,%2\n" /* make addr % 4 == 0 */
4565 - " ar %0,%2\n" /* add alignement to bitnr */
4566 + addr ^= addr & 3; /* align address to 4 */
4567 + nr += (addr & 3) << 3; /* add alignment to bit number */
4570 - " nr %2,%0\n" /* make shift value */
4574 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4577 - " x %3,%4\n" /* make AND mask */
4578 - "0: lr %2,%0\n" /* CS loop starts here */
4579 - " nr %2,%3\n" /* clear bit */
4580 - " cs %0,%2,0(%1)\n"
4583 - " nr %0,%3\n" /* isolate old bit */
4584 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask)
4585 - : "m" (minusone) : "cc", "memory" );
4587 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4588 + mask = ~(1UL << (nr & 31)); /* make AND mask */
4593 + " cs %0,%1,0(%4)\n"
4595 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4596 + : "d" (mask), "a" (addr)
4598 + return (old ^ new) != 0;
4602 * SMP save test_and_change_bit routine based on compare and swap (CS)
4604 -static __inline__ int test_and_change_bit_cs(int nr, volatile void * addr)
4605 +static inline int test_and_change_bit_cs(int nr, volatile void *ptr)
4607 - unsigned long bits, mask;
4608 - __asm__ __volatile__(
4609 + unsigned long addr, old, new, mask;
4611 + addr = (unsigned long) ptr;
4613 - " lhi %2,3\n" /* CS must be aligned on 4 byte b. */
4614 - " nr %2,%1\n" /* isolate last 2 bits of address */
4615 - " xr %1,%2\n" /* make addr % 4 == 0 */
4617 - " ar %0,%2\n" /* add alignement to bitnr */
4618 + addr ^= addr & 3; /* align address to 4 */
4619 + nr += (addr & 3) << 3; /* add alignment to bit number */
4622 - " nr %2,%0\n" /* make shift value */
4626 - " la %1,0(%0,%1)\n" /* calc. address for CS */
4627 - " sll %3,0(%2)\n" /* make OR mask */
4629 - "0: lr %2,%0\n" /* CS loop starts here */
4630 - " xr %2,%3\n" /* change bit */
4631 - " cs %0,%2,0(%1)\n"
4633 - " nr %0,%3\n" /* isolate old bit */
4634 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
4635 - : "cc", "memory" );
4637 + addr += (nr ^ (nr & 31)) >> 3; /* calculate address for CS */
4638 + mask = 1UL << (nr & 31); /* make XOR mask */
4643 + " cs %0,%1,0(%4)\n"
4645 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned int *) addr)
4646 + : "d" (mask), "a" (addr)
4648 + return (old & mask) != 0;
4650 #endif /* CONFIG_SMP */
4653 * fast, non-SMP set_bit routine
4655 -static __inline__ void __set_bit(int nr, volatile void * addr)
4656 +static inline void __set_bit(int nr, volatile void *ptr)
4658 - unsigned long reg1, reg2;
4659 - __asm__ __volatile__(
4665 - " la %1,0(%1,%3)\n"
4666 - " la %0,0(%0,%4)\n"
4667 - " oc 0(1,%1),0(%0)"
4668 - : "=&a" (reg1), "=&a" (reg2)
4669 - : "r" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
4672 -static __inline__ void
4673 -__constant_set_bit(const int nr, volatile void * addr)
4677 - __asm__ __volatile__ ("la 1,%0\n\t"
4679 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4680 - : : "1", "cc", "memory");
4683 - __asm__ __volatile__ ("la 1,%0\n\t"
4685 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4686 - : : "1", "cc", "memory" );
4689 - __asm__ __volatile__ ("la 1,%0\n\t"
4691 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4692 - : : "1", "cc", "memory" );
4695 - __asm__ __volatile__ ("la 1,%0\n\t"
4697 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4698 - : : "1", "cc", "memory" );
4701 - __asm__ __volatile__ ("la 1,%0\n\t"
4703 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4704 - : : "1", "cc", "memory" );
4707 - __asm__ __volatile__ ("la 1,%0\n\t"
4709 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4710 - : : "1", "cc", "memory" );
4713 - __asm__ __volatile__ ("la 1,%0\n\t"
4715 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4716 - : : "1", "cc", "memory" );
4719 - __asm__ __volatile__ ("la 1,%0\n\t"
4721 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4722 - : : "1", "cc", "memory" );
4725 + unsigned long addr;
4727 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
4728 + asm volatile("oc 0(1,%1),0(%2)"
4729 + : "+m" (*(char *) addr)
4730 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
4735 +__constant_set_bit(const int nr, volatile void *ptr)
4737 + unsigned long addr;
4739 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 3);
4742 + asm volatile ("oi 0(%1),0x01"
4743 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4746 + asm volatile ("oi 0(%1),0x02"
4747 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4750 + asm volatile ("oi 0(%1),0x04"
4751 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4754 + asm volatile ("oi 0(%1),0x08"
4755 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4758 + asm volatile ("oi 0(%1),0x10"
4759 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4762 + asm volatile ("oi 0(%1),0x20"
4763 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4766 + asm volatile ("oi 0(%1),0x40"
4767 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4770 + asm volatile ("oi 0(%1),0x80"
4771 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4776 #define set_bit_simple(nr,addr) \
4777 @@ -323,76 +268,58 @@
4779 * fast, non-SMP clear_bit routine
4781 -static __inline__ void
4782 -__clear_bit(int nr, volatile void * addr)
4784 +__clear_bit(int nr, volatile void *ptr)
4786 - unsigned long reg1, reg2;
4787 - __asm__ __volatile__(
4793 - " la %1,0(%1,%3)\n"
4794 - " la %0,0(%0,%4)\n"
4795 - " nc 0(1,%1),0(%0)"
4796 - : "=&a" (reg1), "=&a" (reg2)
4797 - : "r" (nr), "a" (addr), "a" (&_ni_bitmap) : "cc", "memory" );
4800 -static __inline__ void
4801 -__constant_clear_bit(const int nr, volatile void * addr)
4805 - __asm__ __volatile__ ("la 1,%0\n\t"
4807 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4808 - : : "1", "cc", "memory" );
4811 - __asm__ __volatile__ ("la 1,%0\n\t"
4813 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4814 - : : "1", "cc", "memory" );
4817 - __asm__ __volatile__ ("la 1,%0\n\t"
4819 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4820 - : : "1", "cc", "memory" );
4823 - __asm__ __volatile__ ("la 1,%0\n\t"
4825 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4826 - : : "1", "cc", "memory" );
4829 - __asm__ __volatile__ ("la 1,%0\n\t"
4831 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4832 - : : "cc", "memory" );
4835 - __asm__ __volatile__ ("la 1,%0\n\t"
4837 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4838 - : : "1", "cc", "memory" );
4841 - __asm__ __volatile__ ("la 1,%0\n\t"
4843 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4844 - : : "1", "cc", "memory" );
4847 - __asm__ __volatile__ ("la 1,%0\n\t"
4849 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4850 - : : "1", "cc", "memory" );
4853 + unsigned long addr;
4855 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
4856 + asm volatile("nc 0(1,%1),0(%2)"
4857 + : "+m" (*(char *) addr)
4858 + : "a" (addr), "a" (_ni_bitmap + (nr & 7))
4863 +__constant_clear_bit(const int nr, volatile void *ptr)
4865 + unsigned long addr;
4867 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 3);
4870 + asm volatile ("ni 0(%1),0xFE"
4871 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4874 + asm volatile ("ni 0(%1),0xFD"
4875 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4878 + asm volatile ("ni 0(%1),0xFB"
4879 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4882 + asm volatile ("ni 0(%1),0xF7"
4883 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4886 + asm volatile ("ni 0(%1),0xEF"
4887 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4890 + asm volatile ("ni 0(%1),0xDF"
4891 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4894 + asm volatile ("ni 0(%1),0xBF"
4895 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4898 + asm volatile ("ni 0(%1),0x7F"
4899 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
4904 #define clear_bit_simple(nr,addr) \
4905 @@ -403,75 +330,57 @@
4907 * fast, non-SMP change_bit routine
4909 -static __inline__ void __change_bit(int nr, volatile void * addr)
4910 +static inline void __change_bit(int nr, volatile void *ptr)
4912 - unsigned long reg1, reg2;
4913 - __asm__ __volatile__(
4919 - " la %1,0(%1,%3)\n"
4920 - " la %0,0(%0,%4)\n"
4921 - " xc 0(1,%1),0(%0)"
4922 - : "=&a" (reg1), "=&a" (reg2)
4923 - : "r" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
4926 -static __inline__ void
4927 -__constant_change_bit(const int nr, volatile void * addr)
4931 - __asm__ __volatile__ ("la 1,%0\n\t"
4933 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4934 - : : "cc", "memory" );
4937 - __asm__ __volatile__ ("la 1,%0\n\t"
4939 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4940 - : : "cc", "memory" );
4943 - __asm__ __volatile__ ("la 1,%0\n\t"
4945 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4946 - : : "cc", "memory" );
4949 - __asm__ __volatile__ ("la 1,%0\n\t"
4951 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4952 - : : "cc", "memory" );
4955 - __asm__ __volatile__ ("la 1,%0\n\t"
4957 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4958 - : : "cc", "memory" );
4961 - __asm__ __volatile__ ("la 1,%0\n\t"
4963 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4964 - : : "1", "cc", "memory" );
4967 - __asm__ __volatile__ ("la 1,%0\n\t"
4969 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4970 - : : "1", "cc", "memory" );
4973 - __asm__ __volatile__ ("la 1,%0\n\t"
4975 - : "=m" (*((volatile char *) addr + ((nr>>3)^3)))
4976 - : : "1", "cc", "memory" );
4979 + unsigned long addr;
4981 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
4982 + asm volatile("xc 0(1,%1),0(%2)"
4983 + : "+m" (*(char *) addr)
4984 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
4989 +__constant_change_bit(const int nr, volatile void *ptr)
4991 + unsigned long addr;
4993 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 3);
4996 + asm volatile ("xi 0(%1),0x01"
4997 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5000 + asm volatile ("xi 0(%1),0x02"
5001 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5004 + asm volatile ("xi 0(%1),0x04"
5005 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5008 + asm volatile ("xi 0(%1),0x08"
5009 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5012 + asm volatile ("xi 0(%1),0x10"
5013 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5016 + asm volatile ("xi 0(%1),0x20"
5017 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5020 + asm volatile ("xi 0(%1),0x40"
5021 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5024 + asm volatile ("xi 0(%1),0x80"
5025 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5030 #define change_bit_simple(nr,addr) \
5031 @@ -482,74 +391,54 @@
5033 * fast, non-SMP test_and_set_bit routine
5035 -static __inline__ int test_and_set_bit_simple(int nr, volatile void * addr)
5036 +static inline int test_and_set_bit_simple(int nr, volatile void *ptr)
5038 - unsigned long reg1, reg2;
5040 - __asm__ __volatile__(
5046 - " la %1,0(%1,%4)\n"
5049 - " la %2,0(%2,%5)\n"
5050 - " oc 0(1,%1),0(%2)"
5051 - : "=d&" (oldbit), "=&a" (reg1), "=&a" (reg2)
5052 - : "r" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
5053 - return oldbit & 1;
5054 + unsigned long addr;
5057 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
5058 + ch = *(unsigned char *) addr;
5059 + asm volatile("oc 0(1,%1),0(%2)"
5060 + : "+m" (*(char *) addr)
5061 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
5063 + return (ch >> (nr & 7)) & 1;
5065 #define __test_and_set_bit(X,Y) test_and_set_bit_simple(X,Y)
5068 * fast, non-SMP test_and_clear_bit routine
5070 -static __inline__ int test_and_clear_bit_simple(int nr, volatile void * addr)
5071 +static inline int test_and_clear_bit_simple(int nr, volatile void *ptr)
5073 - unsigned long reg1, reg2;
5075 + unsigned long addr;
5078 - __asm__ __volatile__(
5084 - " la %1,0(%1,%4)\n"
5087 - " la %2,0(%2,%5)\n"
5088 - " nc 0(1,%1),0(%2)"
5089 - : "=d&" (oldbit), "=&a" (reg1), "=&a" (reg2)
5090 - : "r" (nr), "a" (addr), "a" (&_ni_bitmap) : "cc", "memory" );
5091 - return oldbit & 1;
5092 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
5093 + ch = *(unsigned char *) addr;
5094 + asm volatile("nc 0(1,%1),0(%2)"
5095 + : "+m" (*(char *) addr)
5096 + : "a" (addr), "a" (_ni_bitmap + (nr & 7))
5098 + return (ch >> (nr & 7)) & 1;
5100 #define __test_and_clear_bit(X,Y) test_and_clear_bit_simple(X,Y)
5103 * fast, non-SMP test_and_change_bit routine
5105 -static __inline__ int test_and_change_bit_simple(int nr, volatile void * addr)
5106 +static inline int test_and_change_bit_simple(int nr, volatile void *ptr)
5108 - unsigned long reg1, reg2;
5110 + unsigned long addr;
5113 - __asm__ __volatile__(
5119 - " la %1,0(%1,%4)\n"
5122 - " la %2,0(%2,%5)\n"
5123 - " xc 0(1,%1),0(%2)"
5124 - : "=d&" (oldbit), "=&a" (reg1), "=&a" (reg2)
5125 - : "r" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
5126 - return oldbit & 1;
5127 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
5128 + ch = *(unsigned char *) addr;
5129 + asm volatile("xc 0(1,%1),0(%2)"
5130 + : "+m" (*(char *) addr)
5131 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
5133 + return (ch >> (nr & 7)) & 1;
5135 #define __test_and_change_bit(X,Y) test_and_change_bit_simple(X,Y)
5137 @@ -574,25 +463,17 @@
5138 * This routine doesn't need to be atomic.
5141 -static __inline__ int __test_bit(int nr, volatile void * addr)
5142 +static inline int __test_bit(int nr, volatile void *ptr)
5144 - unsigned long reg1, reg2;
5146 + unsigned long addr;
5149 - __asm__ __volatile__(
5155 - " ic %0,0(%2,%4)\n"
5157 - : "=d&" (oldbit), "=&a" (reg1), "=&a" (reg2)
5158 - : "r" (nr), "a" (addr) : "cc" );
5159 - return oldbit & 1;
5160 + addr = (unsigned long) ptr + ((nr ^ 24) >> 3);
5161 + ch = *(unsigned char *) addr;
5162 + return (ch >> (nr & 7)) & 1;
5165 -static __inline__ int __constant_test_bit(int nr, volatile void * addr) {
5166 +static inline int __constant_test_bit(int nr, volatile void * addr) {
5167 return (((volatile char *) addr)[(nr>>3)^3] & (1<<(nr&7))) != 0;
5172 * Find-bit routines..
5174 -static __inline__ int find_first_zero_bit(void * addr, unsigned size)
5175 +static inline int find_first_zero_bit(void * addr, unsigned size)
5177 unsigned long cmp, count;
5179 @@ -642,7 +523,45 @@
5180 return (res < size) ? res : size;
5183 -static __inline__ int find_next_zero_bit (void * addr, int size, int offset)
5184 +static inline int find_first_bit(void * addr, unsigned size)
5186 + unsigned long cmp, count;
5191 + __asm__(" slr %1,%1\n"
5196 + "0: c %1,0(%0,%4)\n"
5202 + "1: l %2,0(%0,%4)\n"
5205 + " tml %2,0xffff\n"
5209 + "2: tml %2,0x00ff\n"
5214 + " ic %2,0(%2,%5)\n"
5217 + : "=&a" (res), "=&d" (cmp), "=&a" (count)
5218 + : "a" (size), "a" (addr), "a" (&_sb_findmap) : "cc" );
5219 + return (res < size) ? res : size;
5222 +static inline int find_next_zero_bit (void * addr, int size, int offset)
5224 unsigned long * p = ((unsigned long *) addr) + (offset >> 5);
5225 unsigned long bitvec, reg;
5226 @@ -680,11 +599,49 @@
5227 return (offset + res);
5230 +static inline int find_next_bit (void * addr, int size, int offset)
5232 + unsigned long * p = ((unsigned long *) addr) + (offset >> 5);
5233 + unsigned long bitvec, reg;
5234 + int set, bit = offset & 31, res;
5238 + * Look for set bit in first word
5240 + bitvec = (*p) >> bit;
5241 + __asm__(" slr %0,%0\n"
5243 + " tml %1,0xffff\n"
5247 + "0: tml %1,0x00ff\n"
5252 + " ic %1,0(%1,%3)\n"
5254 + : "=&d" (set), "+a" (bitvec), "=&d" (reg)
5255 + : "a" (&_sb_findmap) : "cc" );
5256 + if (set < (32 - bit))
5257 + return set + offset;
5258 + offset += 32 - bit;
5262 + * No set bit yet, search remaining full words for a bit
5264 + res = find_first_bit (p, size - 32 * (p - (unsigned long *) addr));
5265 + return (offset + res);
5269 * ffz = Find First Zero in word. Undefined if no zero exists,
5270 * so code should check against ~0UL first..
5272 -static __inline__ unsigned long ffz(unsigned long word)
5273 +static inline unsigned long ffz(unsigned long word)
5277 @@ -708,40 +665,109 @@
5281 + * __ffs = find first set bit in word. Undefined if no bit is set,
5282 + * so code should check against 0UL first..
5284 +static inline unsigned long __ffs(unsigned long word)
5286 + unsigned long reg, result;
5288 + __asm__(" slr %0,%0\n"
5290 + " tml %1,0xffff\n"
5294 + "0: tml %1,0x00ff\n"
5299 + " ic %1,0(%1,%3)\n"
5301 + : "=&d" (result), "+a" (word), "=&d" (reg)
5302 + : "a" (&_sb_findmap) : "cc" );
5307 + * Every architecture must define this function. It's the fastest
5308 + * way of searching a 140-bit bitmap where the first 100 bits are
5309 + * unlikely to be set. It's guaranteed that at least one of the 140
5310 + * bits is set.
5312 +static inline int sched_find_first_bit(unsigned long *b)
5314 + return find_first_bit(b, 140);
5318 * ffs: find first bit set. This is defined the same way as
5319 * the libc and compiler builtin ffs routines, therefore
5320 * differs in spirit from the above ffz (man ffs).
5323 -extern int __inline__ ffs (int x)
5324 +extern int inline ffs (int x)
5331 - __asm__(" slr %0,%0\n"
5332 - " tml %1,0xffff\n"
5334 + __asm__(" tml %1,0xffff\n"
5339 "0: tml %1,0x00ff\n"
5344 "1: tml %1,0x000f\n"
5349 "2: tml %1,0x0003\n"
5354 "3: tml %1,0x0001\n"
5358 : "=&d" (r), "+d" (x) : : "cc" );
5364 + * fls: find last bit set.
5366 +extern __inline__ int fls(int x)
5372 + __asm__(" tmh %1,0xffff\n"
5376 + "0: tmh %1,0xff00\n"
5380 + "1: tmh %1,0xf000\n"
5384 + "2: tmh %1,0xc000\n"
5388 + "3: tmh %1,0x8000\n"
5392 + : "+d" (r), "+d" (x) : : "cc" );
5398 #define ext2_set_bit(nr, addr) test_and_set_bit((nr)^24, addr)
5399 #define ext2_clear_bit(nr, addr) test_and_clear_bit((nr)^24, addr)
5400 #define ext2_test_bit(nr, addr) test_bit((nr)^24, addr)
5401 -static __inline__ int ext2_find_first_zero_bit(void *vaddr, unsigned size)
5402 +static inline int ext2_find_first_zero_bit(void *vaddr, unsigned size)
5404 unsigned long cmp, count;
5407 return (res < size) ? res : size;
5410 -static __inline__ int
5412 ext2_find_next_zero_bit(void *vaddr, unsigned size, unsigned offset)
5414 unsigned long *addr = vaddr;
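Every rewritten *_cs() routine above has the same shape: align the address
to a word, fold the byte offset into the bit number, build a mask, then
loop on COMPARE AND SWAP until the store lands without interference from
another cpu.  A portable C sketch of that shape (set_bit_cs_sketch is
illustrative; GCC's __sync_val_compare_and_swap stands in for the CS
instruction):

	static inline void set_bit_cs_sketch(int nr, unsigned int *word)
	{
		unsigned int mask = 1U << (nr & 31);
		unsigned int old, new;

		do {
			old = *word;		/* get old value */
			new = old | mask;	/* set bit */
		} while (__sync_val_compare_and_swap(word, old, new) != old);
	}

The test_and_*_cs() variants derive their return value from the same
loop's old value (old & mask, or equivalently old ^ new in the clear
case), exactly as the new inline assembly computes.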
5415 diff -urN linux-2.4.20/include/asm-s390x/bitops.h linux-2.4.20-o1/include/asm-s390x/bitops.h
5416 --- linux-2.4.20/include/asm-s390x/bitops.h Sat Aug 3 02:39:45 2002
5417 +++ linux-2.4.20-o1/include/asm-s390x/bitops.h Wed Mar 12 00:41:43 2003
5418 @@ -51,271 +51,220 @@
5419 extern const char _oi_bitmap[];
5420 extern const char _ni_bitmap[];
5421 extern const char _zb_findmap[];
5422 +extern const char _sb_findmap[];
5426 * SMP save set_bit routine based on compare and swap (CS)
5428 -static __inline__ void set_bit_cs(unsigned long nr, volatile void * addr)
5429 +static inline void set_bit_cs(unsigned long nr, volatile void *ptr)
5431 - unsigned long bits, mask;
5432 - __asm__ __volatile__(
5433 + unsigned long addr, old, new, mask;
5435 + addr = (unsigned long) ptr;
5437 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5438 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5439 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5441 - " agr %0,%2\n" /* add alignement to bitnr */
5442 + addr ^= addr & 7; /* align address to 8 */
5443 + nr += (addr & 7) << 3; /* add alignment to bit number */
5446 - " nr %2,%0\n" /* make shift value */
5450 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5451 - " sllg %3,%3,0(%2)\n" /* make OR mask */
5453 - "0: lgr %2,%0\n" /* CS loop starts here */
5454 - " ogr %2,%3\n" /* set bit */
5455 - " csg %0,%2,0(%1)\n"
5457 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5458 - : "cc", "memory" );
5459 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5460 + mask = 1UL << (nr & 63); /* make OR mask */
5465 + " csg %0,%1,0(%4)\n"
5467 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5468 + : "d" (mask), "a" (addr)
5473 * SMP save clear_bit routine based on compare and swap (CS)
5475 -static __inline__ void clear_bit_cs(unsigned long nr, volatile void * addr)
5476 +static inline void clear_bit_cs(unsigned long nr, volatile void *ptr)
5478 - unsigned long bits, mask;
5479 - __asm__ __volatile__(
5480 + unsigned long addr, old, new, mask;
5482 + addr = (unsigned long) ptr;
5484 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5485 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5486 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5488 - " agr %0,%2\n" /* add alignement to bitnr */
5489 + addr ^= addr & 7; /* align address to 8 */
5490 + nr += (addr & 7) << 3; /* add alignment to bit number */
5493 - " nr %2,%0\n" /* make shift value */
5497 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5499 - " rllg %3,%3,0(%2)\n" /* make AND mask */
5501 - "0: lgr %2,%0\n" /* CS loop starts here */
5502 - " ngr %2,%3\n" /* clear bit */
5503 - " csg %0,%2,0(%1)\n"
5505 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5506 - : "cc", "memory" );
5507 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5508 + mask = ~(1UL << (nr & 63)); /* make AND mask */
5513 + " csg %0,%1,0(%4)\n"
5515 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5516 + : "d" (mask), "a" (addr)
5521 * SMP save change_bit routine based on compare and swap (CS)
5523 -static __inline__ void change_bit_cs(unsigned long nr, volatile void * addr)
5524 +static inline void change_bit_cs(unsigned long nr, volatile void *ptr)
5526 - unsigned long bits, mask;
5527 - __asm__ __volatile__(
5528 + unsigned long addr, old, new, mask;
5530 + addr = (unsigned long) ptr;
5532 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5533 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5534 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5536 - " agr %0,%2\n" /* add alignement to bitnr */
5537 + addr ^= addr & 7; /* align address to 8 */
5538 + nr += (addr & 7) << 3; /* add alignment to bit number */
5541 - " nr %2,%0\n" /* make shift value */
5545 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5546 - " sllg %3,%3,0(%2)\n" /* make XR mask */
5548 - "0: lgr %2,%0\n" /* CS loop starts here */
5549 - " xgr %2,%3\n" /* change bit */
5550 - " csg %0,%2,0(%1)\n"
5552 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5553 - : "cc", "memory" );
5554 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5555 + mask = 1UL << (nr & 63); /* make XOR mask */
5560 + " csg %0,%1,0(%4)\n"
5562 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5563 + : "d" (mask), "a" (addr)
5568 * SMP save test_and_set_bit routine based on compare and swap (CS)
5570 -static __inline__ int
5571 -test_and_set_bit_cs(unsigned long nr, volatile void * addr)
5573 +test_and_set_bit_cs(unsigned long nr, volatile void *ptr)
5575 - unsigned long bits, mask;
5576 - __asm__ __volatile__(
5577 + unsigned long addr, old, new, mask;
5579 + addr = (unsigned long) ptr;
5581 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5582 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5583 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5585 - " agr %0,%2\n" /* add alignement to bitnr */
5586 + addr ^= addr & 7; /* align address to 8 */
5587 + nr += (addr & 7) << 3; /* add alignment to bit number */
5590 - " nr %2,%0\n" /* make shift value */
5594 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5595 - " sllg %3,%3,0(%2)\n" /* make OR mask */
5597 - "0: lgr %2,%0\n" /* CS loop starts here */
5598 - " ogr %2,%3\n" /* set bit */
5599 - " csg %0,%2,0(%1)\n"
5601 - " ngr %0,%3\n" /* isolate old bit */
5602 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5603 - : "cc", "memory" );
5605 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5606 + mask = 1UL << (nr & 63); /* make OR/test mask */
5611 + " csg %0,%1,0(%4)\n"
5613 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5614 + : "d" (mask), "a" (addr)
5616 + return (old & mask) != 0;
5620 * SMP save test_and_clear_bit routine based on compare and swap (CS)
5622 -static __inline__ int
5623 -test_and_clear_bit_cs(unsigned long nr, volatile void * addr)
5625 +test_and_clear_bit_cs(unsigned long nr, volatile void *ptr)
5627 - unsigned long bits, mask;
5628 - __asm__ __volatile__(
5629 + unsigned long addr, old, new, mask;
5631 + addr = (unsigned long) ptr;
5633 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5634 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5635 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5637 - " agr %0,%2\n" /* add alignement to bitnr */
5638 + addr ^= addr & 7; /* align address to 8 */
5639 + nr += (addr & 7) << 3; /* add alignment to bit number */
5642 - " nr %2,%0\n" /* make shift value */
5646 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5647 - " rllg %3,%3,0(%2)\n" /* make AND mask */
5649 - "0: lgr %2,%0\n" /* CS loop starts here */
5650 - " ngr %2,%3\n" /* clear bit */
5651 - " csg %0,%2,0(%1)\n"
5653 - " xgr %0,%2\n" /* isolate old bit */
5654 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5655 - : "cc", "memory" );
5657 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5658 + mask = ~(1UL << (nr & 63)); /* make AND mask */
5663 + " csg %0,%1,0(%4)\n"
5665 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5666 + : "d" (mask), "a" (addr)
5668 + return (old ^ new) != 0;
5672 * SMP save test_and_change_bit routine based on compare and swap (CS)
5674 -static __inline__ int
5675 -test_and_change_bit_cs(unsigned long nr, volatile void * addr)
5677 +test_and_change_bit_cs(unsigned long nr, volatile void *ptr)
5679 - unsigned long bits, mask;
5680 - __asm__ __volatile__(
5681 + unsigned long addr, old, new, mask;
5683 + addr = (unsigned long) ptr;
5685 - " lghi %2,7\n" /* CS must be aligned on 4 byte b. */
5686 - " ngr %2,%1\n" /* isolate last 2 bits of address */
5687 - " xgr %1,%2\n" /* make addr % 4 == 0 */
5689 - " agr %0,%2\n" /* add alignement to bitnr */
5690 + addr ^= addr & 7; /* align address to 8 */
5691 + nr += (addr & 7) << 3; /* add alignment to bit number */
5694 - " nr %2,%0\n" /* make shift value */
5698 - " la %1,0(%0,%1)\n" /* calc. address for CS */
5699 - " sllg %3,%3,0(%2)\n" /* make OR mask */
5701 - "0: lgr %2,%0\n" /* CS loop starts here */
5702 - " xgr %2,%3\n" /* change bit */
5703 - " csg %0,%2,0(%1)\n"
5705 - " ngr %0,%3\n" /* isolate old bit */
5706 - : "+a" (nr), "+a" (addr), "=&a" (bits), "=&d" (mask) :
5707 - : "cc", "memory" );
5709 + addr += (nr ^ (nr & 63)) >> 3; /* calculate address for CS */
5710 + mask = 1UL << (nr & 63); /* make XOR mask */
5715 + " csg %0,%1,0(%4)\n"
5717 + : "=&d" (old), "=&d" (new), "+m" (*(unsigned long *) addr)
5718 + : "d" (mask), "a" (addr)
5720 + return (old & mask) != 0;
5722 #endif /* CONFIG_SMP */
5725 * fast, non-SMP set_bit routine
5727 -static __inline__ void __set_bit(unsigned long nr, volatile void * addr)
5728 +static inline void __set_bit(unsigned long nr, volatile void *ptr)
5730 - unsigned long reg1, reg2;
5731 - __asm__ __volatile__(
5737 - " la %1,0(%1,%3)\n"
5738 - " la %0,0(%0,%4)\n"
5739 - " oc 0(1,%1),0(%0)"
5740 - : "=&a" (reg1), "=&a" (reg2)
5741 - : "a" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
5744 -static __inline__ void
5745 -__constant_set_bit(const unsigned long nr, volatile void * addr)
5749 - __asm__ __volatile__ ("la 1,%0\n\t"
5751 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5752 - : : "1", "cc", "memory");
5755 - __asm__ __volatile__ ("la 1,%0\n\t"
5757 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5758 - : : "1", "cc", "memory" );
5761 - __asm__ __volatile__ ("la 1,%0\n\t"
5763 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5764 - : : "1", "cc", "memory" );
5767 - __asm__ __volatile__ ("la 1,%0\n\t"
5769 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5770 - : : "1", "cc", "memory" );
5773 - __asm__ __volatile__ ("la 1,%0\n\t"
5775 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5776 - : : "1", "cc", "memory" );
5779 - __asm__ __volatile__ ("la 1,%0\n\t"
5781 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5782 - : : "1", "cc", "memory" );
5785 - __asm__ __volatile__ ("la 1,%0\n\t"
5787 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5788 - : : "1", "cc", "memory" );
5791 - __asm__ __volatile__ ("la 1,%0\n\t"
5793 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5794 - : : "1", "cc", "memory" );
5797 + unsigned long addr;
5799 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
5800 + asm volatile("oc 0(1,%1),0(%2)"
5801 + : "+m" (*(char *) addr)
5802 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
5807 +__constant_set_bit(const unsigned long nr, volatile void *ptr)
5809 + unsigned long addr;
5811 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 7);
5814 + asm volatile ("oi 0(%1),0x01"
5815 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5818 + asm volatile ("oi 0(%1),0x02"
5819 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5822 + asm volatile ("oi 0(%1),0x04"
5823 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5826 + asm volatile ("oi 0(%1),0x08"
5827 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5830 + asm volatile ("oi 0(%1),0x10"
5831 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5834 + asm volatile ("oi 0(%1),0x20"
5835 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5838 + asm volatile ("oi 0(%1),0x40"
5839 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5842 + asm volatile ("oi 0(%1),0x80"
5843 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5848 #define set_bit_simple(nr,addr) \
5849 @@ -326,76 +275,58 @@
5851 * fast, non-SMP clear_bit routine
5853 -static __inline__ void
5854 -__clear_bit(unsigned long nr, volatile void * addr)
5856 +__clear_bit(unsigned long nr, volatile void *ptr)
5858 - unsigned long reg1, reg2;
5859 - __asm__ __volatile__(
5865 - " la %1,0(%1,%3)\n"
5866 - " la %0,0(%0,%4)\n"
5867 - " nc 0(1,%1),0(%0)"
5868 - : "=&a" (reg1), "=&a" (reg2)
5869 - : "d" (nr), "a" (addr), "a" (&_ni_bitmap) : "cc", "memory" );
5872 -static __inline__ void
5873 -__constant_clear_bit(const unsigned long nr, volatile void * addr)
5877 - __asm__ __volatile__ ("la 1,%0\n\t"
5879 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5880 - : : "1", "cc", "memory" );
5883 - __asm__ __volatile__ ("la 1,%0\n\t"
5885 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5886 - : : "1", "cc", "memory" );
5889 - __asm__ __volatile__ ("la 1,%0\n\t"
5891 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5892 - : : "1", "cc", "memory" );
5895 - __asm__ __volatile__ ("la 1,%0\n\t"
5897 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5898 - : : "1", "cc", "memory" );
5901 - __asm__ __volatile__ ("la 1,%0\n\t"
5903 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5904 - : : "cc", "memory" );
5907 - __asm__ __volatile__ ("la 1,%0\n\t"
5909 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5910 - : : "1", "cc", "memory" );
5913 - __asm__ __volatile__ ("la 1,%0\n\t"
5915 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5916 - : : "1", "cc", "memory" );
5919 - __asm__ __volatile__ ("la 1,%0\n\t"
5921 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
5922 - : : "1", "cc", "memory" );
5925 + unsigned long addr;
5927 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
5928 + asm volatile("nc 0(1,%1),0(%2)"
5929 + : "+m" (*(char *) addr)
5930 + : "a" (addr), "a" (_ni_bitmap + (nr & 7))
5935 +__constant_clear_bit(const unsigned long nr, volatile void *ptr)
5937 + unsigned long addr;
5939 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 7);
5942 + asm volatile ("ni 0(%1),0xFE"
5943 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5946 + asm volatile ("ni 0(%1),0xFD"
5947 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5950 + asm volatile ("ni 0(%1),0xFB"
5951 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5954 + asm volatile ("ni 0(%1),0xF7"
5955 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5958 + asm volatile ("ni 0(%1),0xEF"
5959 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5962 + asm volatile ("ni 0(%1),0xDF"
5963 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5966 + asm volatile ("ni 0(%1),0xBF"
5967 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5970 + asm volatile ("ni 0(%1),0x7F"
5971 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
5976 #define clear_bit_simple(nr,addr) \
5977 @@ -406,75 +337,57 @@
5979 * fast, non-SMP change_bit routine
5981 -static __inline__ void __change_bit(unsigned long nr, volatile void * addr)
5982 +static inline void __change_bit(unsigned long nr, volatile void *ptr)
5984 - unsigned long reg1, reg2;
5985 - __asm__ __volatile__(
5991 - " la %1,0(%1,%3)\n"
5992 - " la %0,0(%0,%4)\n"
5993 - " xc 0(1,%1),0(%0)"
5994 - : "=&a" (reg1), "=&a" (reg2)
5995 - : "d" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
5998 -static __inline__ void
5999 -__constant_change_bit(const unsigned long nr, volatile void * addr)
6003 - __asm__ __volatile__ ("la 1,%0\n\t"
6005 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6006 - : : "cc", "memory" );
6009 - __asm__ __volatile__ ("la 1,%0\n\t"
6011 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6012 - : : "cc", "memory" );
6015 - __asm__ __volatile__ ("la 1,%0\n\t"
6017 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6018 - : : "cc", "memory" );
6021 - __asm__ __volatile__ ("la 1,%0\n\t"
6023 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6024 - : : "cc", "memory" );
6027 - __asm__ __volatile__ ("la 1,%0\n\t"
6029 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6030 - : : "cc", "memory" );
6033 - __asm__ __volatile__ ("la 1,%0\n\t"
6035 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6036 - : : "1", "cc", "memory" );
6039 - __asm__ __volatile__ ("la 1,%0\n\t"
6041 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6042 - : : "1", "cc", "memory" );
6045 - __asm__ __volatile__ ("la 1,%0\n\t"
6047 - : "=m" (*((volatile char *) addr + ((nr>>3)^7)))
6048 - : : "1", "cc", "memory" );
6051 + unsigned long addr;
6053 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
6054 + asm volatile("xc 0(1,%1),0(%2)"
6055 + : "+m" (*(char *) addr)
6056 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
6061 +__constant_change_bit(const unsigned long nr, volatile void *ptr)
6063 + unsigned long addr;
6065 + addr = ((unsigned long) ptr) + ((nr >> 3) ^ 7);
6068 + asm volatile ("xi 0(%1),0x01"
6069 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6072 + asm volatile ("xi 0(%1),0x02"
6073 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6076 + asm volatile ("xi 0(%1),0x04"
6077 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6080 + asm volatile ("xi 0(%1),0x08"
6081 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6084 + asm volatile ("xi 0(%1),0x10"
6085 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6088 + asm volatile ("xi 0(%1),0x20"
6089 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6092 + asm volatile ("xi 0(%1),0x40"
6093 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6096 + asm volatile ("xi 0(%1),0x80"
6097 + : "+m" (*(char *) addr) : "a" (addr) : "cc" );
6102 #define change_bit_simple(nr,addr) \
6103 @@ -485,77 +398,57 @@
6105 * fast, non-SMP test_and_set_bit routine
6107 -static __inline__ int
6108 -test_and_set_bit_simple(unsigned long nr, volatile void * addr)
6110 +test_and_set_bit_simple(unsigned long nr, volatile void *ptr)
6112 - unsigned long reg1, reg2;
6114 - __asm__ __volatile__(
6120 - " la %1,0(%1,%4)\n"
6123 - " la %2,0(%2,%5)\n"
6124 - " oc 0(1,%1),0(%2)"
6125 - : "=&d" (oldbit), "=&a" (reg1), "=&a" (reg2)
6126 - : "d" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
6127 - return oldbit & 1;
6128 + unsigned long addr;
6131 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
6132 + ch = *(unsigned char *) addr;
6133 + asm volatile("oc 0(1,%1),0(%2)"
6134 + : "+m" (*(char *) addr)
6135 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
6137 + return (ch >> (nr & 7)) & 1;
6139 #define __test_and_set_bit(X,Y) test_and_set_bit_simple(X,Y)
6142 * fast, non-SMP test_and_clear_bit routine
6144 -static __inline__ int
6145 -test_and_clear_bit_simple(unsigned long nr, volatile void * addr)
6147 +test_and_clear_bit_simple(unsigned long nr, volatile void *ptr)
6149 - unsigned long reg1, reg2;
6151 + unsigned long addr;
6154 - __asm__ __volatile__(
6160 - " la %1,0(%1,%4)\n"
6163 - " la %2,0(%2,%5)\n"
6164 - " nc 0(1,%1),0(%2)"
6165 - : "=&d" (oldbit), "=&a" (reg1), "=&a" (reg2)
6166 - : "d" (nr), "a" (addr), "a" (&_ni_bitmap) : "cc", "memory" );
6167 - return oldbit & 1;
6168 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
6169 + ch = *(unsigned char *) addr;
6170 + asm volatile("nc 0(1,%1),0(%2)"
6171 + : "+m" (*(char *) addr)
6172 + : "a" (addr), "a" (_ni_bitmap + (nr & 7))
6174 + return (ch >> (nr & 7)) & 1;
6176 #define __test_and_clear_bit(X,Y) test_and_clear_bit_simple(X,Y)
6179 * fast, non-SMP test_and_change_bit routine
6181 -static __inline__ int
6182 -test_and_change_bit_simple(unsigned long nr, volatile void * addr)
6184 +test_and_change_bit_simple(unsigned long nr, volatile void *ptr)
6186 - unsigned long reg1, reg2;
6188 + unsigned long addr;
6191 - __asm__ __volatile__(
6197 - " la %1,0(%1,%4)\n"
6200 - " la %2,0(%2,%5)\n"
6201 - " xc 0(1,%1),0(%2)"
6202 - : "=&d" (oldbit), "=&a" (reg1), "=&a" (reg2)
6203 - : "d" (nr), "a" (addr), "a" (&_oi_bitmap) : "cc", "memory" );
6204 - return oldbit & 1;
6205 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
6206 + ch = *(unsigned char *) addr;
6207 + asm volatile("xc 0(1,%1),0(%2)"
6208 + : "+m" (*(char *) addr)
6209 + : "a" (addr), "a" (_oi_bitmap + (nr & 7))
6211 + return (ch >> (nr & 7)) & 1;
6213 #define __test_and_change_bit(X,Y) test_and_change_bit_simple(X,Y)
6215 @@ -580,26 +473,18 @@
6216 * This routine doesn't need to be atomic.
6219 -static __inline__ int __test_bit(unsigned long nr, volatile void * addr)
6220 +static inline int __test_bit(unsigned long nr, volatile void *ptr)
6222 - unsigned long reg1, reg2;
6224 + unsigned long addr;
6227 - __asm__ __volatile__(
6233 - " ic %0,0(%2,%4)\n"
6235 - : "=&d" (oldbit), "=&a" (reg1), "=&a" (reg2)
6236 - : "d" (nr), "a" (addr) : "cc" );
6237 - return oldbit & 1;
6238 + addr = (unsigned long) ptr + ((nr ^ 56) >> 3);
6239 + ch = *(unsigned char *) addr;
6240 + return (ch >> (nr & 7)) & 1;
6243 -static __inline__ int
6244 -__constant_test_bit(unsigned long nr, volatile void * addr) {
6246 +__constant_test_bit(unsigned long nr, volatile void *addr) {
6247 return (((volatile char *) addr)[(nr>>3)^7] & (1<<(nr&7))) != 0;
6252 * Find-bit routines..
6254 -static __inline__ unsigned long
6255 +static inline unsigned long
6256 find_first_zero_bit(void * addr, unsigned long size)
6258 unsigned long res, cmp, count;
6259 @@ -653,7 +538,49 @@
6260 return (res < size) ? res : size;
6263 -static __inline__ unsigned long
6264 +static inline unsigned long
6265 +find_first_bit(void * addr, unsigned long size)
6267 + unsigned long res, cmp, count;
6271 + __asm__(" slgr %1,%1\n"
6276 + "0: cg %1,0(%0,%4)\n"
6282 + "1: lg %2,0(%0,%4)\n"
6287 + " srlg %2,%2,32\n"
6288 + "2: lghi %1,0xff\n"
6289 + " tmll %2,0xffff\n"
6293 + "3: tmll %2,0x00ff\n"
6298 + " ic %2,0(%2,%5)\n"
6301 + : "=&a" (res), "=&d" (cmp), "=&a" (count)
6302 + : "a" (size), "a" (addr), "a" (&_sb_findmap) : "cc" );
6303 + return (res < size) ? res : size;
6306 +static inline unsigned long
6307 find_next_zero_bit (void * addr, unsigned long size, unsigned long offset)
6309 unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
6310 @@ -697,14 +624,56 @@
6311 return (offset + res);
6314 +static inline unsigned long
6315 +find_next_bit (void * addr, unsigned long size, unsigned long offset)
6317 + unsigned long * p = ((unsigned long *) addr) + (offset >> 6);
6318 + unsigned long bitvec, reg;
6319 + unsigned long set, bit = offset & 63, res;
6323 + * Look for zero in first word
6325 + bitvec = (*p) >> bit;
6326 + __asm__(" slgr %0,%0\n"
6330 + " srlg %1,%1,32\n"
6331 + "0: lghi %2,0xff\n"
6332 + " tmll %1,0xffff\n"
6335 + " srlg %1,%1,16\n"
6336 + "1: tmll %1,0x00ff\n"
6341 + " ic %1,0(%1,%3)\n"
6343 + : "=&d" (set), "+a" (bitvec), "=&d" (reg)
6344 + : "a" (&_sb_findmap) : "cc" );
6345 + if (set < (64 - bit))
6346 + return set + offset;
6347 + offset += 64 - bit;
6351 + * No set bit yet, search remaining full words for a bit
6353 + res = find_first_bit (p, size - 64 * (p - (unsigned long *) addr));
6354 + return (offset + res);
6358 * ffz = Find First Zero in word. Undefined if no zero exists,
6359 * so code should check against ~0UL first..
6361 -static __inline__ unsigned long ffz(unsigned long word)
6362 +static inline unsigned long ffz(unsigned long word)
6364 - unsigned long reg;
6366 + unsigned long reg, result;
6368 __asm__(" lhi %2,-1\n"
6370 @@ -730,40 +699,112 @@
6374 + * __ffs = find first bit in word. Undefined if no bit exists,
6375 + * so code should check against 0UL first..
6377 +static inline unsigned long __ffs (unsigned long word)
6379 + unsigned long reg, result;
6381 + __asm__(" slgr %0,%0\n"
6385 + " srlg %1,%1,32\n"
6386 + "0: lghi %2,0xff\n"
6387 + " tmll %1,0xffff\n"
6390 + " srlg %1,%1,16\n"
6391 + "1: tmll %1,0x00ff\n"
6396 + " ic %1,0(%1,%3)\n"
6398 + : "=&d" (result), "+a" (word), "=&d" (reg)
6399 + : "a" (&_sb_findmap) : "cc" );
6400 + return result;
6401 +}
6404 + * Every architecture must define this function. It's the fastest
6405 + * way of searching a 140-bit bitmap where the first 100 bits are
6406 + * unlikely to be set. It's guaranteed that at least one of the 140
6407 + * bits is set.
6409 +static inline int sched_find_first_bit(unsigned long *b)
6411 + return find_first_bit(b, 140);
6412 +}
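With 64-bit words the 140-bit priority bitmap spans three longs, so a generic C equivalent of this helper would look like the sketch below (assuming a __ffs() that is undefined for a zero argument):

    static inline int sched_find_first_bit_sketch(unsigned long *b)
    {
            if (b[0])
                    return __ffs(b[0]);
            if (b[1])
                    return __ffs(b[1]) + 64;
            /* at least one of the 140 bits is guaranteed to be set */
            return __ffs(b[2]) + 128;
    }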
6415 * ffs: find first bit set. This is defined the same way as
6416 * the libc and compiler builtin ffs routines, therefore
6417 * differs in spirit from the above ffz (man ffs).
6420 -extern int __inline__ ffs (int x)
6421 +extern int inline ffs (int x)
6428 - __asm__(" slr %0,%0\n"
6429 - " tml %1,0xffff\n"
6431 + __asm__(" tml %1,0xffff\n"
6436 "0: tml %1,0x00ff\n"
6441 "1: tml %1,0x000f\n"
6446 "2: tml %1,0x0003\n"
6451 "3: tml %1,0x0001\n"
6455 : "=&d" (r), "+d" (x) : : "cc" );
6461 + * fls: find last bit set.
6463 +extern __inline__ int fls(int x)
6469 + __asm__(" tmh %1,0xffff\n"
6473 + "0: tmh %1,0xff00\n"
6477 + "1: tmh %1,0xf000\n"
6481 + "2: tmh %1,0xc000\n"
6485 + "3: tmh %1,0x8000\n"
6489 + : "+d" (r), "+d" (x) : : "cc" );
6495 #define ext2_set_bit(nr, addr) test_and_set_bit((nr)^56, addr)
6496 #define ext2_clear_bit(nr, addr) test_and_clear_bit((nr)^56, addr)
6497 #define ext2_test_bit(nr, addr) test_bit((nr)^56, addr)
6498 -static __inline__ unsigned long
6499 +static inline unsigned long
6500 ext2_find_first_zero_bit(void *vaddr, unsigned long size)
6502 unsigned long res, cmp, count;
6504 return (res < size) ? res : size;
6507 -static __inline__ unsigned long
6508 +static inline unsigned long
6509 ext2_find_next_zero_bit(void *vaddr, unsigned long size, unsigned long offset)
6511 unsigned long *addr = vaddr;
6512 diff -urN linux-2.4.20/include/asm-sparc/bitops.h linux-2.4.20-o1/include/asm-sparc/bitops.h
6513 --- linux-2.4.20/include/asm-sparc/bitops.h Fri Dec 21 18:42:03 2001
6514 +++ linux-2.4.20-o1/include/asm-sparc/bitops.h Wed Mar 12 00:44:05 2003
6515 @@ -207,6 +207,57 @@
6520 + * __ffs - find first bit in word.
6521 + * @word: The word to search
6523 + * Undefined if no bit exists, so code should check against 0 first.
6525 +static __inline__ int __ffs(unsigned long word)
6529 + if ((word & 0xffff) == 0) {
6533 + if ((word & 0xff) == 0) {
6537 + if ((word & 0xf) == 0) {
6541 + if ((word & 0x3) == 0) {
6545 + if ((word & 0x1) == 0)
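Written out in full, the binary-search pattern this routine follows is (equivalent C sketch, for reference):

    static inline int __ffs_sketch(unsigned long word)
    {
            int num = 0;

            if ((word & 0xffff) == 0) { num += 16; word >>= 16; }
            if ((word & 0xff) == 0)   { num += 8;  word >>= 8;  }
            if ((word & 0xf) == 0)    { num += 4;  word >>= 4;  }
            if ((word & 0x3) == 0)    { num += 2;  word >>= 2;  }
            if ((word & 0x1) == 0)
                    num += 1;
            return num;
    }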
6551 + * Every architecture must define this function. It's the fastest
6552 + * way of searching a 140-bit bitmap where the first 100 bits are
6553 + * unlikely to be set. It's guaranteed that at least one of the 140
6554 + * bits is set.
6556 +static __inline__ int sched_find_first_bit(unsigned long *b)
6559 + if (unlikely(b[0]))
6560 + return __ffs(b[0]);
6561 + if (unlikely(b[1]))
6562 + return __ffs(b[1]) + 32;
6563 + if (unlikely(b[2]))
6564 + return __ffs(b[2]) + 64;
6565 + if (b[3])
6566 + return __ffs(b[3]) + 96;
6567 + return __ffs(b[4]) + 128;
6571 * ffs: find first bit set. This is defined the same way as
6572 * the libc and compiler builtin ffs routines, therefore
6573 diff -urN linux-2.4.20/include/asm-sparc64/bitops.h linux-2.4.20-o1/include/asm-sparc64/bitops.h
6574 --- linux-2.4.20/include/asm-sparc64/bitops.h Fri Dec 21 18:42:03 2001
6575 +++ linux-2.4.20-o1/include/asm-sparc64/bitops.h Wed Mar 12 00:41:43 2003
6579 * bitops.h: Bit string operations on the V9.
6581 * Copyright 1996, 1997 David S. Miller (davem@caip.rutgers.edu)
6583 #ifndef _SPARC64_BITOPS_H
6584 #define _SPARC64_BITOPS_H
6586 +#include <linux/compiler.h>
6587 #include <asm/byteorder.h>
6589 -extern long ___test_and_set_bit(unsigned long nr, volatile void *addr);
6590 -extern long ___test_and_clear_bit(unsigned long nr, volatile void *addr);
6591 -extern long ___test_and_change_bit(unsigned long nr, volatile void *addr);
6592 +extern long ___test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
6593 +extern long ___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
6594 +extern long ___test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
6596 #define test_and_set_bit(nr,addr) ({___test_and_set_bit(nr,addr)!=0;})
6597 #define test_and_clear_bit(nr,addr) ({___test_and_clear_bit(nr,addr)!=0;})
6598 @@ -21,109 +22,132 @@
6599 #define change_bit(nr,addr) ((void)___test_and_change_bit(nr,addr))
6601 /* "non-atomic" versions... */
6602 -#define __set_bit(X,Y) \
6603 -do { unsigned long __nr = (X); \
6604 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6605 - *__m |= (1UL << (__nr & 63)); \
6607 -#define __clear_bit(X,Y) \
6608 -do { unsigned long __nr = (X); \
6609 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6610 - *__m &= ~(1UL << (__nr & 63)); \
6612 -#define __change_bit(X,Y) \
6613 -do { unsigned long __nr = (X); \
6614 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6615 - *__m ^= (1UL << (__nr & 63)); \
6617 -#define __test_and_set_bit(X,Y) \
6618 -({ unsigned long __nr = (X); \
6619 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6620 - long __old = *__m; \
6621 - long __mask = (1UL << (__nr & 63)); \
6622 - *__m = (__old | __mask); \
6623 - ((__old & __mask) != 0); \
6625 -#define __test_and_clear_bit(X,Y) \
6626 -({ unsigned long __nr = (X); \
6627 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6628 - long __old = *__m; \
6629 - long __mask = (1UL << (__nr & 63)); \
6630 - *__m = (__old & ~__mask); \
6631 - ((__old & __mask) != 0); \
6633 -#define __test_and_change_bit(X,Y) \
6634 -({ unsigned long __nr = (X); \
6635 - long *__m = ((long *) (Y)) + (__nr >> 6); \
6636 - long __old = *__m; \
6637 - long __mask = (1UL << (__nr & 63)); \
6638 - *__m = (__old ^ __mask); \
6639 - ((__old & __mask) != 0); \
6642 +static __inline__ void __set_bit(int nr, volatile unsigned long *addr)
6644 + volatile unsigned long *m = addr + (nr >> 6);
6646 + *m |= (1UL << (nr & 63));
6649 +static __inline__ void __clear_bit(int nr, volatile unsigned long *addr)
6651 + volatile unsigned long *m = addr + (nr >> 6);
6653 + *m &= ~(1UL << (nr & 63));
6656 +static __inline__ void __change_bit(int nr, volatile unsigned long *addr)
6658 + volatile unsigned long *m = addr + (nr >> 6);
6660 + *m ^= (1UL << (nr & 63));
6663 +static __inline__ int __test_and_set_bit(int nr, volatile unsigned long *addr)
6665 + volatile unsigned long *m = addr + (nr >> 6);
6667 + long mask = (1UL << (nr & 63));
6669 + *m = (old | mask);
6670 + return ((old & mask) != 0);
6673 +static __inline__ int __test_and_clear_bit(int nr, volatile unsigned long *addr)
6675 + volatile unsigned long *m = addr + (nr >> 6);
6677 + long mask = (1UL << (nr & 63));
6679 + *m = (old & ~mask);
6680 + return ((old & mask) != 0);
6683 +static __inline__ int __test_and_change_bit(int nr, volatile unsigned long *addr)
6685 + volatile unsigned long *m = addr + (nr >> 6);
6687 + long mask = (1UL << (nr & 63));
6689 + *m = (old ^ mask);
6690 + return ((old & mask) != 0);
6693 #define smp_mb__before_clear_bit() do { } while(0)
6694 #define smp_mb__after_clear_bit() do { } while(0)
6696 -extern __inline__ int test_bit(int nr, __const__ void *addr)
6697 +static __inline__ int test_bit(int nr, __const__ volatile unsigned long *addr)
6699 - return (1UL & (((__const__ long *) addr)[nr >> 6] >> (nr & 63))) != 0UL;
6700 + return (1UL & ((addr)[nr >> 6] >> (nr & 63))) != 0UL;
6703 /* The easy/cheese version for now. */
6704 -extern __inline__ unsigned long ffz(unsigned long word)
6705 +static __inline__ unsigned long ffz(unsigned long word)
6707 unsigned long result;
6709 -#ifdef ULTRA_HAS_POPULATION_COUNT /* Thanks for nothing Sun... */
6710 - __asm__ __volatile__(
6713 -" xnor %0, %%g1, %%g2\n"
6715 -"1: " : "=&r" (result)
6719 -#if 1 /* def EASY_CHEESE_VERSION */
6726 - unsigned long tmp;
6731 - tmp = ~word & -~word;
6732 - if (!(unsigned)tmp) {
6736 - if (!(unsigned short)tmp) {
6740 - if (!(unsigned char)tmp) {
6744 + * __ffs - find first bit in word.
6745 + * @word: The word to search
6747 + * Undefined if no bit exists, so code should check against 0 first.
6749 +static __inline__ unsigned long __ffs(unsigned long word)
6751 + unsigned long result = 0;
6753 + while (!(word & 1UL)) {
6754 + result++;
6755 + word >>= 1;
6756 + }
6757 - if (tmp & 0xf0) result += 4;
6758 - if (tmp & 0xcc) result += 2;
6759 - if (tmp & 0xaa) result ++;
6766 + * fls: find last bit set.
6769 +#define fls(x) generic_fls(x)
6774 + * Every architecture must define this function. It's the fastest
6775 + * way of searching a 140-bit bitmap where the first 100 bits are
6776 + * unlikely to be set. It's guaranteed that at least one of the 140
6777 + * bits is set.
6779 +static inline int sched_find_first_bit(unsigned long *b)
6781 + if (unlikely(b[0]))
6782 + return __ffs(b[0]);
6783 + if (unlikely(((unsigned int)b[1])))
6784 + return __ffs(b[1]) + 64;
6785 + if (b[1] >> 32)
6786 + return __ffs(b[1] >> 32) + 96;
6787 + return __ffs(b[2]) + 128;
6791 * ffs: find first bit set. This is defined the same way as
6792 * the libc and compiler builtin ffs routines, therefore
6793 * differs in spirit from the above ffz (man ffs).
6796 -#define ffs(x) generic_ffs(x)
6797 +static __inline__ int ffs(int x)
6798 +{
6799 + if (!x)
6800 + return 0;
6801 + return __ffs((unsigned long)x);
6802 +}
6805 * hweightN: returns the hamming weight (i.e. the number
6808 #ifdef ULTRA_HAS_POPULATION_COUNT
6810 -extern __inline__ unsigned int hweight32(unsigned int w)
6811 +static __inline__ unsigned int hweight32(unsigned int w)
6819 -extern __inline__ unsigned int hweight16(unsigned int w)
6820 +static __inline__ unsigned int hweight16(unsigned int w)
6828 -extern __inline__ unsigned int hweight8(unsigned int w)
6829 +static __inline__ unsigned int hweight8(unsigned int w)
6833 @@ -165,14 +189,69 @@
6835 #endif /* __KERNEL__ */
6838 + * find_next_bit - find the next set bit in a memory region
6839 + * @addr: The address to base the search on
6840 + * @offset: The bitnumber to start searching at
6841 + * @size: The maximum size to search
6843 +static __inline__ unsigned long find_next_bit(unsigned long *addr, unsigned long size, unsigned long offset)
6844 +{
6845 + unsigned long *p = addr + (offset >> 6);
6846 + unsigned long result = offset & ~63UL;
6847 + unsigned long tmp;
6849 + if (offset >= size)
6850 + return size;
6851 + size -= result;
6852 + offset &= 63UL;
6853 + if (offset) {
6854 + tmp = *(p++);
6855 + tmp &= (~0UL << offset);
6856 + if (size < 64)
6857 + goto found_first;
6858 + if (tmp)
6859 + goto found_middle;
6860 + size -= 64;
6861 + result += 64;
6862 + }
6863 + while (size & ~63UL) {
6864 + if ((tmp = *(p++)))
6865 + goto found_middle;
6866 + result += 64;
6867 + size -= 64;
6868 + }
6869 + if (!size)
6870 + return result;
6871 + tmp = *p;
6872 +
6873 +found_first:
6874 + tmp &= (~0UL >> (64 - size));
6875 + if (tmp == 0UL) /* Are any bits set? */
6876 + return result + size; /* Nope. */
6877 +found_middle:
6878 + return result + __ffs(tmp);
6879 +}
6882 + * find_first_bit - find the first set bit in a memory region
6883 + * @addr: The address to start the search at
6884 + * @size: The maximum size to search
6886 + * Returns the bit-number of the first set bit, not the number of the byte
6887 + * containing a bit.
6889 +#define find_first_bit(addr, size) \
6890 + find_next_bit((addr), (size), 0)
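Callers typically walk all set bits with the usual first/next idiom, along these lines (hypothetical map and size):

    unsigned long bit;

    for (bit = find_first_bit(map, size); bit < size;
         bit = find_next_bit(map, size, bit + 1)) {
            /* 'bit' is the index of a set bit in map */
    }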
6892 /* find_next_zero_bit() finds the first zero bit in a bit string of length
6893 * 'size' bits, starting the search at bit 'offset'. This is largely based
6894 * on Linus's ALPHA routines, which are pretty portable BTW.
6897 -extern __inline__ unsigned long find_next_zero_bit(void *addr, unsigned long size, unsigned long offset)
6898 +static __inline__ unsigned long find_next_zero_bit(unsigned long *addr, unsigned long size, unsigned long offset)
6900 - unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
6901 + unsigned long *p = addr + (offset >> 6);
6902 unsigned long result = offset & ~63UL;
6905 @@ -211,15 +290,15 @@
6906 #define find_first_zero_bit(addr, size) \
6907 find_next_zero_bit((addr), (size), 0)
6909 -extern long ___test_and_set_le_bit(int nr, volatile void *addr);
6910 -extern long ___test_and_clear_le_bit(int nr, volatile void *addr);
6911 +extern long ___test_and_set_le_bit(int nr, volatile unsigned long *addr);
6912 +extern long ___test_and_clear_le_bit(int nr, volatile unsigned long *addr);
6914 #define test_and_set_le_bit(nr,addr) ({___test_and_set_le_bit(nr,addr)!=0;})
6915 #define test_and_clear_le_bit(nr,addr) ({___test_and_clear_le_bit(nr,addr)!=0;})
6916 #define set_le_bit(nr,addr) ((void)___test_and_set_le_bit(nr,addr))
6917 #define clear_le_bit(nr,addr) ((void)___test_and_clear_le_bit(nr,addr))
6919 -extern __inline__ int test_le_bit(int nr, __const__ void * addr)
6920 +static __inline__ int test_le_bit(int nr, __const__ unsigned long * addr)
6923 __const__ unsigned char *ADDR = (__const__ unsigned char *) addr;
6925 #define find_first_zero_le_bit(addr, size) \
6926 find_next_zero_le_bit((addr), (size), 0)
6928 -extern __inline__ unsigned long find_next_zero_le_bit(void *addr, unsigned long size, unsigned long offset)
6929 +static __inline__ unsigned long find_next_zero_le_bit(unsigned long *addr, unsigned long size, unsigned long offset)
6931 - unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
6932 + unsigned long *p = addr + (offset >> 6);
6933 unsigned long result = offset & ~63UL;
6936 @@ -271,18 +350,22 @@
6940 -#define ext2_set_bit test_and_set_le_bit
6941 -#define ext2_clear_bit test_and_clear_le_bit
6942 -#define ext2_test_bit test_le_bit
6943 -#define ext2_find_first_zero_bit find_first_zero_le_bit
6944 -#define ext2_find_next_zero_bit find_next_zero_le_bit
6945 +#define ext2_set_bit(nr,addr) test_and_set_le_bit((nr),(unsigned long *)(addr))
6946 +#define ext2_clear_bit(nr,addr) test_and_clear_le_bit((nr),(unsigned long *)(addr))
6947 +#define ext2_test_bit(nr,addr) test_le_bit((nr),(unsigned long *)(addr))
6948 +#define ext2_find_first_zero_bit(addr, size) \
6949 + find_first_zero_le_bit((unsigned long *)(addr), (size))
6950 +#define ext2_find_next_zero_bit(addr, size, off) \
6951 + find_next_zero_le_bit((unsigned long *)(addr), (size), (off))
6953 /* Bitmap functions for the minix filesystem. */
6954 -#define minix_test_and_set_bit(nr,addr) test_and_set_bit(nr,addr)
6955 -#define minix_set_bit(nr,addr) set_bit(nr,addr)
6956 -#define minix_test_and_clear_bit(nr,addr) test_and_clear_bit(nr,addr)
6957 -#define minix_test_bit(nr,addr) test_bit(nr,addr)
6958 -#define minix_find_first_zero_bit(addr,size) find_first_zero_bit(addr,size)
6959 +#define minix_test_and_set_bit(nr,addr) test_and_set_bit((nr),(unsigned long *)(addr))
6960 +#define minix_set_bit(nr,addr) set_bit((nr),(unsigned long *)(addr))
6961 +#define minix_test_and_clear_bit(nr,addr) \
6962 + test_and_clear_bit((nr),(unsigned long *)(addr))
6963 +#define minix_test_bit(nr,addr) test_bit((nr),(unsigned long *)(addr))
6964 +#define minix_find_first_zero_bit(addr,size) \
6965 + find_first_zero_bit((unsigned long *)(addr),(size))
6967 #endif /* __KERNEL__ */
6969 diff -urN linux-2.4.20/include/asm-sparc64/smp.h linux-2.4.20-o1/include/asm-sparc64/smp.h
6970 --- linux-2.4.20/include/asm-sparc64/smp.h Fri Nov 29 00:53:15 2002
6971 +++ linux-2.4.20-o1/include/asm-sparc64/smp.h Wed Mar 12 00:41:43 2003
6976 -#define smp_processor_id() (current->processor)
6977 +#define smp_processor_id() (current->cpu)
6979 /* This needn't do anything as we do not sleep the cpu
6980 * inside of the idler task, so an interrupt is not needed
6981 diff -urN linux-2.4.20/include/asm-sparc64/system.h linux-2.4.20-o1/include/asm-sparc64/system.h
6982 --- linux-2.4.20/include/asm-sparc64/system.h Sat Aug 3 02:39:45 2002
6983 +++ linux-2.4.20-o1/include/asm-sparc64/system.h Wed Mar 12 00:41:43 2003
6984 @@ -143,7 +143,18 @@
6986 #define flush_user_windows flushw_user
6987 #define flush_register_windows flushw_all
6988 -#define prepare_to_switch flushw_all
6990 +#define prepare_arch_schedule(prev) task_lock(prev)
6991 +#define finish_arch_schedule(prev) task_unlock(prev)
6992 +#define prepare_arch_switch(rq, next) \
6993 +do { spin_lock(&(next)->switch_lock); \
6994 + spin_unlock(&(rq)->lock); \
6998 +#define finish_arch_switch(rq, prev) \
6999 +do { spin_unlock_irq(&(prev)->switch_lock); \
7002 #ifndef CONFIG_DEBUG_SPINLOCK
7003 #define CHECK_LOCKS(PREV) do { } while(0)
7004 diff -urN linux-2.4.20/include/linux/kernel_stat.h linux-2.4.20-o1/include/linux/kernel_stat.h
7005 --- linux-2.4.20/include/linux/kernel_stat.h Fri Nov 29 00:53:15 2002
7006 +++ linux-2.4.20-o1/include/linux/kernel_stat.h Wed Mar 12 00:41:43 2003
7008 #elif !defined(CONFIG_ARCH_S390)
7009 unsigned int irqs[NR_CPUS][NR_IRQS];
7011 - unsigned int context_swtch;
7014 extern struct kernel_stat kstat;
7015 diff -urN linux-2.4.20/include/linux/sched.h linux-2.4.20-o1/include/linux/sched.h
7016 --- linux-2.4.20/include/linux/sched.h Fri Nov 29 00:53:15 2002
7017 +++ linux-2.4.20-o1/include/linux/sched.h Wed Mar 12 00:41:43 2003
7019 extern unsigned long event;
7021 #include <linux/config.h>
7022 +#include <linux/compiler.h>
7023 #include <linux/binfmts.h>
7024 #include <linux/threads.h>
7025 #include <linux/kernel.h>
7027 #include <asm/mmu.h>
7029 #include <linux/smp.h>
7030 -#include <linux/tty.h>
7031 +//#include <linux/tty.h>
7032 #include <linux/sem.h>
7033 #include <linux/signal.h>
7034 #include <linux/securebits.h>
7036 #define CT_TO_SECS(x) ((x) / HZ)
7037 #define CT_TO_USECS(x) (((x) % HZ) * 1000000/HZ)
7039 -extern int nr_running, nr_threads;
7040 +extern int nr_threads;
7041 extern int last_pid;
7042 +extern unsigned long nr_running(void);
7043 +extern unsigned long nr_uninterruptible(void);
7045 -#include <linux/fs.h>
7046 +//#include <linux/fs.h>
7047 #include <linux/time.h>
7048 #include <linux/param.h>
7049 #include <linux/resource.h>
7050 @@ -119,12 +122,6 @@
7051 #define SCHED_FIFO 1
7055 - * This is an additional bit set when we want to
7056 - * yield the CPU for one re-schedule..
7058 -#define SCHED_YIELD 0x10
7060 struct sched_param {
7063 @@ -142,17 +139,21 @@
7066 extern rwlock_t tasklist_lock;
7067 -extern spinlock_t runqueue_lock;
7068 extern spinlock_t mmlist_lock;
7070 +typedef struct task_struct task_t;
7072 extern void sched_init(void);
7073 -extern void init_idle(void);
7074 +extern void init_idle(task_t *idle, int cpu);
7075 extern void show_state(void);
7076 extern void cpu_init (void);
7077 extern void trap_init(void);
7078 extern void update_process_times(int user);
7079 -extern void update_one_process(struct task_struct *p, unsigned long user,
7080 +extern void update_one_process(task_t *p, unsigned long user,
7081 unsigned long system, int cpu);
7082 +extern void scheduler_tick(int user_tick, int system);
7083 +extern void migration_init(void);
7084 +extern unsigned long cache_decay_ticks;
7086 #define MAX_SCHEDULE_TIMEOUT LONG_MAX
7087 extern signed long FASTCALL(schedule_timeout(signed long timeout));
7088 @@ -162,6 +163,28 @@
7089 extern void flush_scheduled_tasks(void);
7090 extern int start_context_thread(void);
7091 extern int current_is_keventd(void);
7092 +extern void FASTCALL(sched_exit(task_t * p));
7093 +extern int FASTCALL(idle_cpu(int cpu));
7096 + * Priority of a process goes from 0..MAX_PRIO-1, valid RT
7097 + * priority is 0..MAX_RT_PRIO-1, and SCHED_OTHER tasks are
7098 + * in the range MAX_RT_PRIO..MAX_PRIO-1. Priority values
7099 + * are inverted: lower p->prio value means higher priority.
7101 + * The MAX_RT_USER_PRIO value allows the actual maximum
7102 + * RT priority to be separate from the value exported to
7103 + * user-space. This allows kernel threads to set their
7104 + * priority to a value higher than any user task. Note:
7105 + * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
7107 + * Both values are configurable at compile-time.
7110 +#define MAX_USER_RT_PRIO 100
7111 +#define MAX_RT_PRIO MAX_USER_RT_PRIO
7113 +#define MAX_PRIO (MAX_RT_PRIO + 40)
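With the defaults above the arithmetic works out as follows, which is also where the 140-bit bitmap searched by sched_find_first_bit() comes from:

    MAX_RT_PRIO = MAX_USER_RT_PRIO = 100    (RT priorities occupy 0..99)
    MAX_PRIO    = 100 + 40 = 140            (nice levels occupy 100..139)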
7116 * The default fd array needs to be at least BITS_PER_LONG,
7118 extern struct user_struct root_user;
7119 #define INIT_USER (&root_user)
7121 +typedef struct prio_array prio_array_t;
7123 struct task_struct {
7125 * offsets of these are hardcoded elsewhere - touch with care
7126 @@ -301,35 +326,26 @@
7128 int lock_depth; /* Lock depth */
7131 - * offset 32 begins here on 32-bit platforms. We keep
7132 - * all fields in a single cacheline that are needed for
7133 - * the goodness() loop in schedule().
7137 - unsigned long policy;
7138 - struct mm_struct *mm;
7141 - * cpus_runnable is ~0 if the process is not running on any
7142 - * CPU. It's (1 << cpu) if it's running on a CPU. This mask
7143 - * is updated under the runqueue lock.
7145 - * To determine whether a process might run on a CPU, this
7146 - * mask is AND-ed with cpus_allowed.
7147 + * offset 32 begins here on 32-bit platforms.
7149 - unsigned long cpus_runnable, cpus_allowed;
7151 - * (only the 'next' pointer fits into the cacheline, but
7152 - * that's just fine.)
7154 - struct list_head run_list;
7155 - unsigned long sleep_time;
7157 + int prio, static_prio;
7159 + prio_array_t *array;
7161 - struct task_struct *next_task, *prev_task;
7162 - struct mm_struct *active_mm;
7163 + unsigned long sleep_avg;
7164 + unsigned long sleep_timestamp;
7166 + unsigned long policy;
7167 + unsigned long cpus_allowed;
7168 + unsigned int time_slice, first_time_slice;
7170 + task_t *next_task, *prev_task;
7172 + struct mm_struct *mm, *active_mm;
7173 struct list_head local_pages;
7175 unsigned int allocation_order, nr_local_pages;
7178 @@ -351,12 +367,12 @@
7179 * older sibling, respectively. (p->father can be replaced with
7182 - struct task_struct *p_opptr, *p_pptr, *p_cptr, *p_ysptr, *p_osptr;
7183 + task_t *p_opptr, *p_pptr, *p_cptr, *p_ysptr, *p_osptr;
7184 struct list_head thread_group;
7186 /* PID hash table linkage. */
7187 - struct task_struct *pidhash_next;
7188 - struct task_struct **pidhash_pprev;
7189 + task_t *pidhash_next;
7190 + task_t **pidhash_pprev;
7192 wait_queue_head_t wait_chldexit; /* for wait4() */
7193 struct completion *vfork_done; /* for vfork() */
7196 /* Protection of (de-)allocation: mm, files, fs, tty */
7197 spinlock_t alloc_lock;
7198 +/* context-switch lock */
7199 + spinlock_t switch_lock;
7201 /* journalling filesystem info */
7203 @@ -454,9 +472,15 @@
7205 #define _STK_LIM (8*1024*1024)
7207 -#define DEF_COUNTER (10*HZ/100) /* 100 ms time slice */
7208 -#define MAX_COUNTER (20*HZ/100)
7209 -#define DEF_NICE (0)
7210 +#ifdef CONFIG_SMP
7211 +extern void set_cpus_allowed(task_t *p, unsigned long new_mask);
7212 +#else
7213 +#define set_cpus_allowed(p, new_mask) do { } while (0)
7214 +#endif
7216 +extern void set_user_nice(task_t *p, long nice);
7217 +extern int task_prio(task_t *p);
7218 +extern int task_nice(task_t *p);
7220 extern void yield(void);
7222 @@ -477,14 +501,14 @@
7223 addr_limit: KERNEL_DS, \
7224 exec_domain: &default_exec_domain, \
7226 - counter: DEF_COUNTER, \
7228 + prio: MAX_PRIO-20, \
7229 + static_prio: MAX_PRIO-20, \
7230 policy: SCHED_OTHER, \
7231 + cpus_allowed: -1, \
7233 active_mm: &init_mm, \
7234 - cpus_runnable: -1, \
7235 - cpus_allowed: -1, \
7236 run_list: LIST_HEAD_INIT(tsk.run_list), \
7242 pending: { NULL, &tsk.pending.head, {{0}}}, \
7244 alloc_lock: SPIN_LOCK_UNLOCKED, \
7245 + switch_lock: SPIN_LOCK_UNLOCKED, \
7246 journal_info: NULL, \
7249 @@ -518,24 +543,23 @@
7253 - struct task_struct task;
7255 unsigned long stack[INIT_TASK_SIZE/sizeof(long)];
7258 extern union task_union init_task_union;
7260 extern struct mm_struct init_mm;
7261 -extern struct task_struct *init_tasks[NR_CPUS];
7263 /* PID hashing. (shouldnt this be dynamic?) */
7264 #define PIDHASH_SZ (4096 >> 2)
7265 -extern struct task_struct *pidhash[PIDHASH_SZ];
7266 +extern task_t *pidhash[PIDHASH_SZ];
7268 #define pid_hashfn(x) ((((x) >> 8) ^ (x)) & (PIDHASH_SZ - 1))
7270 -static inline void hash_pid(struct task_struct *p)
7271 +static inline void hash_pid(task_t *p)
7273 - struct task_struct **htable = &pidhash[pid_hashfn(p->pid)];
7274 + task_t **htable = &pidhash[pid_hashfn(p->pid)];
7276 if((p->pidhash_next = *htable) != NULL)
7277 (*htable)->pidhash_pprev = &p->pidhash_next;
7278 @@ -543,16 +567,16 @@
7279 p->pidhash_pprev = htable;
7282 -static inline void unhash_pid(struct task_struct *p)
7283 +static inline void unhash_pid(task_t *p)
7286 p->pidhash_next->pidhash_pprev = p->pidhash_pprev;
7287 *p->pidhash_pprev = p->pidhash_next;
7290 -static inline struct task_struct *find_task_by_pid(int pid)
7291 +static inline task_t *find_task_by_pid(int pid)
7293 - struct task_struct *p, **htable = &pidhash[pid_hashfn(pid)];
7294 + task_t *p, **htable = &pidhash[pid_hashfn(pid)];
7296 for(p = *htable; p && p->pid != pid; p = p->pidhash_next)
7298 @@ -560,19 +584,6 @@
7302 -#define task_has_cpu(tsk) ((tsk)->cpus_runnable != ~0UL)
7304 -static inline void task_set_cpu(struct task_struct *tsk, unsigned int cpu)
7306 - tsk->processor = cpu;
7307 - tsk->cpus_runnable = 1UL << cpu;
7310 -static inline void task_release_cpu(struct task_struct *tsk)
7312 - tsk->cpus_runnable = ~0UL;
7315 /* per-UID process charging. */
7316 extern struct user_struct * alloc_uid(uid_t);
7317 extern void free_uid(struct user_struct *);
7318 @@ -599,47 +610,50 @@
7319 extern void FASTCALL(interruptible_sleep_on(wait_queue_head_t *q));
7320 extern long FASTCALL(interruptible_sleep_on_timeout(wait_queue_head_t *q,
7321 signed long timeout));
7322 -extern int FASTCALL(wake_up_process(struct task_struct * tsk));
7323 +extern int FASTCALL(wake_up_process(task_t * p));
7324 +extern void FASTCALL(wake_up_forked_process(task_t * p));
7326 #define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 1)
7327 #define wake_up_nr(x, nr) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, nr)
7328 #define wake_up_all(x) __wake_up((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 0)
7329 -#define wake_up_sync(x) __wake_up_sync((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 1)
7330 -#define wake_up_sync_nr(x, nr) __wake_up_sync((x),TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, nr)
7331 #define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE, 1)
7332 #define wake_up_interruptible_nr(x, nr) __wake_up((x),TASK_INTERRUPTIBLE, nr)
7333 #define wake_up_interruptible_all(x) __wake_up((x),TASK_INTERRUPTIBLE, 0)
7334 -#define wake_up_interruptible_sync(x) __wake_up_sync((x),TASK_INTERRUPTIBLE, 1)
7335 -#define wake_up_interruptible_sync_nr(x, nr) __wake_up_sync((x),TASK_INTERRUPTIBLE, nr)
7336 +#ifdef CONFIG_SMP
7337 +#define wake_up_interruptible_sync(x) __wake_up_sync((x),TASK_INTERRUPTIBLE, 1)
7338 +#else
7339 +#define wake_up_interruptible_sync(x) __wake_up((x),TASK_INTERRUPTIBLE, 1)
7340 +#endif
7342 asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struct rusage * ru);
7344 extern int in_group_p(gid_t);
7345 extern int in_egroup_p(gid_t);
7347 extern void proc_caches_init(void);
7348 -extern void flush_signals(struct task_struct *);
7349 -extern void flush_signal_handlers(struct task_struct *);
7350 +extern void flush_signals(task_t *);
7351 +extern void flush_signal_handlers(task_t *);
7352 extern void sig_exit(int, int, struct siginfo *);
7353 extern int dequeue_signal(sigset_t *, siginfo_t *);
7354 extern void block_all_signals(int (*notifier)(void *priv), void *priv,
7356 extern void unblock_all_signals(void);
7357 -extern int send_sig_info(int, struct siginfo *, struct task_struct *);
7358 -extern int force_sig_info(int, struct siginfo *, struct task_struct *);
7359 +extern int send_sig_info(int, struct siginfo *, task_t *);
7360 +extern int force_sig_info(int, struct siginfo *, task_t *);
7361 extern int kill_pg_info(int, struct siginfo *, pid_t);
7362 extern int kill_sl_info(int, struct siginfo *, pid_t);
7363 extern int kill_proc_info(int, struct siginfo *, pid_t);
7364 -extern void notify_parent(struct task_struct *, int);
7365 -extern void do_notify_parent(struct task_struct *, int);
7366 -extern void force_sig(int, struct task_struct *);
7367 -extern int send_sig(int, struct task_struct *, int);
7368 +extern void notify_parent(task_t *, int);
7369 +extern void do_notify_parent(task_t *, int);
7370 +extern void force_sig(int, task_t *);
7371 +extern int send_sig(int, task_t *, int);
7372 extern int kill_pg(pid_t, int, int);
7373 extern int kill_sl(pid_t, int, int);
7374 extern int kill_proc(pid_t, int, int);
7375 extern int do_sigaction(int, const struct k_sigaction *, struct k_sigaction *);
7376 extern int do_sigaltstack(const stack_t *, stack_t *, unsigned long);
7378 -static inline int signal_pending(struct task_struct *p)
7379 +static inline int signal_pending(task_t *p)
7381 return (p->sigpending != 0);
7384 This is required every time the blocked sigset_t changes.
7385 All callers should have t->sigmask_lock. */
7387 -static inline void recalc_sigpending(struct task_struct *t)
7388 +static inline void recalc_sigpending(task_t *t)
7390 t->sigpending = has_pending_signals(&t->pending.signal, &t->blocked);
7392 @@ -785,16 +799,17 @@
7393 extern int expand_fdset(struct files_struct *, int nr);
7394 extern void free_fdset(fd_set *, int);
7396 -extern int copy_thread(int, unsigned long, unsigned long, unsigned long, struct task_struct *, struct pt_regs *);
7397 +extern int copy_thread(int, unsigned long, unsigned long, unsigned long, task_t *, struct pt_regs *);
7398 extern void flush_thread(void);
7399 extern void exit_thread(void);
7401 -extern void exit_mm(struct task_struct *);
7402 -extern void exit_files(struct task_struct *);
7403 -extern void exit_sighand(struct task_struct *);
7404 +extern void exit_mm(task_t *);
7405 +extern void exit_files(task_t *);
7406 +extern void exit_sighand(task_t *);
7408 extern void reparent_to_init(void);
7409 extern void daemonize(void);
7410 +extern task_t *child_reaper;
7412 extern int do_execve(char *, char **, char **, struct pt_regs *);
7413 extern int do_fork(unsigned long, unsigned long, struct pt_regs *, unsigned long);
7415 extern void FASTCALL(add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t * wait));
7416 extern void FASTCALL(remove_wait_queue(wait_queue_head_t *q, wait_queue_t * wait));
7418 +extern void wait_task_inactive(task_t * p);
7419 +extern void kick_if_running(task_t * p);
7421 #define __wait_event(wq, condition) \
7423 wait_queue_t __wait; \
7424 @@ -884,27 +902,12 @@
7425 for (task = next_thread(current) ; task != current ; task = next_thread(task))
7427 #define next_thread(p) \
7428 - list_entry((p)->thread_group.next, struct task_struct, thread_group)
7429 + list_entry((p)->thread_group.next, task_t, thread_group)
7431 #define thread_group_leader(p) (p->pid == p->tgid)
7433 -static inline void del_from_runqueue(struct task_struct * p)
7434 +static inline void unhash_process(task_t *p)
7437 - p->sleep_time = jiffies;
7438 - list_del(&p->run_list);
7439 - p->run_list.next = NULL;
7442 -static inline int task_on_runqueue(struct task_struct *p)
7444 - return (p->run_list.next != NULL);
7447 -static inline void unhash_process(struct task_struct *p)
7449 - if (task_on_runqueue(p))
7450 - out_of_line_bug();
7451 write_lock_irq(&tasklist_lock);
7454 @@ -914,12 +917,12 @@
7457 /* Protects ->fs, ->files, ->mm, and synchronises with wait4(). Nests inside tasklist_lock */
7458 -static inline void task_lock(struct task_struct *p)
7459 +static inline void task_lock(task_t *p)
7461 spin_lock(&p->alloc_lock);
7464 -static inline void task_unlock(struct task_struct *p)
7465 +static inline void task_unlock(task_t *p)
7467 spin_unlock(&p->alloc_lock);
7469 @@ -943,6 +946,26 @@
7473 +static inline void set_need_resched(void)
7475 + current->need_resched = 1;
7478 +static inline void clear_need_resched(void)
7480 + current->need_resched = 0;
7483 +static inline void set_tsk_need_resched(task_t *tsk)
7485 + tsk->need_resched = 1;
7488 +static inline void clear_tsk_need_resched(task_t *tsk)
7490 + tsk->need_resched = 0;
7493 static inline int need_resched(void)
7495 return (unlikely(current->need_resched));
7499 #endif /* __KERNEL__ */
7502 diff -urN linux-2.4.20/include/linux/smp.h linux-2.4.20-o1/include/linux/smp.h
7503 --- linux-2.4.20/include/linux/smp.h Thu Nov 22 20:46:19 2001
7504 +++ linux-2.4.20-o1/include/linux/smp.h Wed Mar 12 00:41:43 2003
7506 #define cpu_number_map(cpu) 0
7507 #define smp_call_function(func,info,retry,wait) ({ 0; })
7508 #define cpu_online_map 1
7509 +static inline void smp_send_reschedule(int cpu) { }
7510 +static inline void smp_send_reschedule_all(void) { }
7515 + * Common definitions:
7517 +#define cpu() smp_processor_id()
7520 diff -urN linux-2.4.20/include/linux/smp_balance.h linux-2.4.20-o1/include/linux/smp_balance.h
7521 --- linux-2.4.20/include/linux/smp_balance.h Thu Jan 1 01:00:00 1970
7522 +++ linux-2.4.20-o1/include/linux/smp_balance.h Wed Mar 12 00:41:43 2003
7524 +#ifndef _LINUX_SMP_BALANCE_H
7525 +#define _LINUX_SMP_BALANCE_H
7528 + * per-architecture load balancing logic, e.g. for hyperthreading
7531 +#ifdef ARCH_HAS_SMP_BALANCE
7532 +#include <asm/smp_balance.h>
7533 +#else
7534 +#define arch_load_balance(x, y) (0)
7535 +#define arch_reschedule_idle_override(x, idle) (idle)
7536 +#endif
7538 +#endif /* _LINUX_SMP_BALANCE_H */
7539 diff -urN linux-2.4.20/include/linux/wait.h linux-2.4.20-o1/include/linux/wait.h
7540 --- linux-2.4.20/include/linux/wait.h Thu Nov 22 20:46:19 2001
7541 +++ linux-2.4.20-o1/include/linux/wait.h Wed Mar 12 00:41:43 2003
7543 # define wq_write_lock_irq write_lock_irq
7544 # define wq_write_lock_irqsave write_lock_irqsave
7545 # define wq_write_unlock_irqrestore write_unlock_irqrestore
7546 +# define wq_write_unlock_irq write_unlock_irq
7547 # define wq_write_unlock write_unlock
7549 # define wq_lock_t spinlock_t
7551 # define wq_write_lock_irq spin_lock_irq
7552 # define wq_write_lock_irqsave spin_lock_irqsave
7553 # define wq_write_unlock_irqrestore spin_unlock_irqrestore
7554 +# define wq_write_unlock_irq spin_unlock_irq
7555 # define wq_write_unlock spin_unlock
7558 diff -urN linux-2.4.20/init/main.c linux-2.4.20-o1/init/main.c
7559 --- linux-2.4.20/init/main.c Sat Aug 3 02:39:46 2002
7560 +++ linux-2.4.20-o1/init/main.c Wed Mar 12 00:41:43 2003
7562 extern void setup_arch(char **);
7563 extern void cpu_idle(void);
7565 -unsigned long wait_init_idle;
7569 #ifdef CONFIG_X86_LOCAL_APIC
7570 @@ -298,34 +296,24 @@
7571 APIC_init_uniprocessor();
7574 -#define smp_init() do { } while (0)
7575 +#define smp_init() do { } while (0)
7581 /* Called by boot processor to activate the rest. */
7582 static void __init smp_init(void)
7584 /* Get other processors into their bootup holding patterns. */
7586 - wait_init_idle = cpu_online_map;
7587 - clear_bit(current->processor, &wait_init_idle); /* Don't wait on me! */
7589 smp_threads_ready=1;
7592 - /* Wait for the other cpus to set up their idle processes */
7593 - printk("Waiting on wait_init_idle (map = 0x%lx)\n", wait_init_idle);
7594 - while (wait_init_idle) {
7598 - printk("All processors have done init_idle\n");
7605 * We need to finalize in a non-__init function or else race conditions
7606 * between the root thread and the init thread may cause start_kernel to
7609 kernel_thread(init, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
7611 - current->need_resched = 1;
7618 * Activate the first processor.
7619 @@ -424,14 +411,18 @@
7624 printk("POSIX conformance testing by UNIFIX\n");
7627 - * We count on the initial thread going ok
7628 - * Like idlers init is an unlocked kernel thread, which will
7629 - * make syscalls (and thus be locked).
7630 + init_idle(current, smp_processor_id());
7632 + * We count on the initial thread going ok
7633 + * Like idlers init is an unlocked kernel thread, which will
7634 + * make syscalls (and thus be locked).
7638 + /* Do the rest non-__init'ed, we're now alive */
7642 @@ -460,6 +451,10 @@
7644 static void __init do_basic_setup(void)
7646 + /* Start the per-CPU migration threads */
7647 +#if CONFIG_SMP
7648 + migration_init();
7649 +#endif
7652 * Tell the world that we're going to be the grim
7653 diff -urN linux-2.4.20/kernel/capability.c linux-2.4.20-o1/kernel/capability.c
7654 --- linux-2.4.20/kernel/capability.c Sat Jun 24 06:06:37 2000
7655 +++ linux-2.4.20-o1/kernel/capability.c Wed Mar 12 00:41:43 2003
7657 #include <linux/mm.h>
7658 #include <asm/uaccess.h>
7660 +unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
7662 kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
7664 /* Note: never hold tasklist_lock while spinning for this one */
7665 diff -urN linux-2.4.20/kernel/exit.c linux-2.4.20-o1/kernel/exit.c
7666 --- linux-2.4.20/kernel/exit.c Fri Nov 29 00:53:15 2002
7667 +++ linux-2.4.20-o1/kernel/exit.c Wed Mar 12 00:41:43 2003
7670 static void release_task(struct task_struct * p)
7672 - if (p != current) {
7677 - * Wait to make sure the process isn't on the
7678 - * runqueue (active on some other CPU still)
7682 - if (!task_has_cpu(p))
7688 - } while (task_has_cpu(p));
7691 + wait_task_inactive(p);
7693 - atomic_dec(&p->user->processes);
7694 - free_uid(p->user);
7695 - unhash_process(p);
7697 - release_thread(p);
7698 - current->cmin_flt += p->min_flt + p->cmin_flt;
7699 - current->cmaj_flt += p->maj_flt + p->cmaj_flt;
7700 - current->cnswap += p->nswap + p->cnswap;
7702 - * Potentially available timeslices are retrieved
7703 - * here - this way the parent does not get penalized
7704 - * for creating too many processes.
7706 - * (this cannot be used to artificially 'generate'
7707 - * timeslices, because any timeslice recovered here
7708 - * was given away by the parent in the first place.)
7710 - current->counter += p->counter;
7711 - if (current->counter >= MAX_COUNTER)
7712 - current->counter = MAX_COUNTER;
7714 - free_task_struct(p);
7716 - printk("task releasing itself\n");
7718 + atomic_dec(&p->user->processes);
7719 + free_uid(p->user);
7720 + unhash_process(p);
7722 + release_thread(p);
7723 + current->cmin_flt += p->min_flt + p->cmin_flt;
7724 + current->cmaj_flt += p->maj_flt + p->cmaj_flt;
7725 + current->cnswap += p->nswap + p->cnswap;
7728 + free_task_struct(p);
7732 @@ -150,6 +123,79 @@
7737 + * reparent_to_init() - Reparent the calling kernel thread to the init task.
7739 + * If a kernel thread is launched as a result of a system call, or if
7740 + * it ever exits, it should generally reparent itself to init so that
7741 + * it is correctly cleaned up on exit.
7743 + * The various task state such as scheduling policy and priority may have
7744 + * been inherited from a user process, so we reset them to sane values here.
7746 + * NOTE that reparent_to_init() gives the caller full capabilities.
7748 +void reparent_to_init(void)
7750 + write_lock_irq(&tasklist_lock);
7752 + /* Reparent to init */
7753 + REMOVE_LINKS(current);
7754 + current->p_pptr = child_reaper;
7755 + current->p_opptr = child_reaper;
7756 + SET_LINKS(current);
7758 + /* Set the exit signal to SIGCHLD so we signal init on exit */
7759 + current->exit_signal = SIGCHLD;
7761 + current->ptrace = 0;
7762 + if ((current->policy == SCHED_OTHER) && (task_nice(current) < 0))
7763 + set_user_nice(current, 0);
7764 + /* cpus_allowed? */
7765 + /* rt_priority? */
7767 + current->cap_effective = CAP_INIT_EFF_SET;
7768 + current->cap_inheritable = CAP_INIT_INH_SET;
7769 + current->cap_permitted = CAP_FULL_SET;
7770 + current->keep_capabilities = 0;
7771 + memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim)));
7772 + current->user = INIT_USER;
7774 + write_unlock_irq(&tasklist_lock);
7778 + * Put all the gunge required to become a kernel thread without
7779 + * attached user resources in one place where it belongs.
7782 +void daemonize(void)
7784 + struct fs_struct *fs;
7788 + * If we were started as result of loading a module, close all of the
7789 + * user space pages. We don't need them, and if we didn't close them
7790 + * they would be locked into memory.
7794 + current->session = 1;
7795 + current->pgrp = 1;
7796 + current->tty = NULL;
7798 + /* Become as one with the init task */
7800 + exit_fs(current); /* current->fs->count--; */
7801 + fs = init_task.fs;
7803 + atomic_inc(&fs->count);
7804 + exit_files(current);
7805 + current->files = init_task.files;
7806 + atomic_inc(&current->files->count);
7810 * When we die, we re-parent all our children.
7811 * Try to give them to another thread in our thread
7813 /* Make sure we're not reparenting to ourselves */
7814 p->p_opptr = child_reaper;
7816 + p->first_time_slice = 0;
7817 if (p->pdeath_signal) send_sig(p->pdeath_signal, p, 0);
7820 diff -urN linux-2.4.20/kernel/fork.c linux-2.4.20-o1/kernel/fork.c
7821 --- linux-2.4.20/kernel/fork.c Fri Nov 29 00:53:15 2002
7822 +++ linux-2.4.20-o1/kernel/fork.c Wed Mar 12 00:41:43 2003
7825 /* The idle threads do not count.. */
7830 unsigned long total_forks; /* Handle normal Linux uptimes. */
7833 struct task_struct *pidhash[PIDHASH_SZ];
7835 +rwlock_t tasklist_lock __cacheline_aligned = RW_LOCK_UNLOCKED; /* outer */
7837 void add_wait_queue(wait_queue_head_t *q, wait_queue_t * wait)
7839 unsigned long flags;
7841 if (p->pid == 0 && current->pid != 0)
7842 goto bad_fork_cleanup;
7844 - p->run_list.next = NULL;
7845 - p->run_list.prev = NULL;
7848 init_waitqueue_head(&p->wait_chldexit);
7849 p->vfork_done = NULL;
7851 init_completion(&vfork);
7853 spin_lock_init(&p->alloc_lock);
7854 + spin_lock_init(&p->switch_lock);
7857 init_sigpending(&p->pending);
7858 @@ -665,11 +664,11 @@
7862 - p->cpus_runnable = ~0UL;
7863 - p->processor = current->processor;
7865 /* ?? should we just memset this ?? */
7866 for(i = 0; i < smp_num_cpus; i++)
7867 - p->per_cpu_utime[i] = p->per_cpu_stime[i] = 0;
7868 + p->per_cpu_utime[cpu_logical_map(i)] =
7869 + p->per_cpu_stime[cpu_logical_map(i)] = 0;
7870 spin_lock_init(&p->sigmask_lock);
7873 @@ -706,15 +705,27 @@
7874 p->pdeath_signal = 0;
7877 - * "share" dynamic priority between parent and child, thus the
7878 - * total amount of dynamic priorities in the system doesn't change,
7879 - * more scheduling fairness. This is only important in the first
7880 - * timeslice, on the long run the scheduling behaviour is unchanged.
7882 - p->counter = (current->counter + 1) >> 1;
7883 - current->counter >>= 1;
7884 - if (!current->counter)
7885 - current->need_resched = 1;
7886 + * Share the timeslice between parent and child, thus the
7887 + * total amount of pending timeslices in the system doesn't change,
7888 + * resulting in more scheduling fairness.
7891 + if (!current->time_slice)
7893 + p->time_slice = (current->time_slice + 1) >> 1;
7894 + current->time_slice >>= 1;
7895 + p->first_time_slice = 1;
7896 + if (!current->time_slice) {
7898 + * This case is rare, it happens when the parent has only
7899 + * a single jiffy left from its timeslice. Taking the
7900 + * runqueue lock is not a problem.
7902 + current->time_slice = 1;
7903 + scheduler_tick(0,0);
7905 + p->sleep_timestamp = jiffies;
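Worked example of the split: a parent holding 11 ticks keeps 11 >> 1 = 5 while the child receives (11 + 1) >> 1 = 6, and 5 + 6 = 11, so no timeslice is created or destroyed:

    parent->time_slice = 11
    child : (11 + 1) >> 1 = 6 ticks
    parent:  11 >> 1      = 5 ticks    (6 + 5 = 11)

The rare branch above handles a parent down to its last jiffy, whose remaining slice is expired immediately via scheduler_tick().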
7909 * Ok, add it to the run-queues and make it
7910 @@ -750,11 +761,16 @@
7912 if (p->ptrace & PT_PTRACED)
7913 send_sig(SIGSTOP, p, 1);
7915 - wake_up_process(p); /* do this last */
7916 + wake_up_forked_process(p); /* do this last */
7918 if (clone_flags & CLONE_VFORK)
7919 wait_for_completion(&vfork);
7922 + * Let the child process run first, to avoid most of the
7923 + * COW overhead when the child exec()s afterwards.
7925 + current->need_resched = 1;
7929 diff -urN linux-2.4.20/kernel/ksyms.c linux-2.4.20-o1/kernel/ksyms.c
7930 --- linux-2.4.20/kernel/ksyms.c Fri Nov 29 00:53:15 2002
7931 +++ linux-2.4.20-o1/kernel/ksyms.c Wed Mar 12 00:41:43 2003
7933 /* process management */
7934 EXPORT_SYMBOL(complete_and_exit);
7935 EXPORT_SYMBOL(__wake_up);
7936 -EXPORT_SYMBOL(__wake_up_sync);
7937 EXPORT_SYMBOL(wake_up_process);
7938 EXPORT_SYMBOL(sleep_on);
7939 EXPORT_SYMBOL(sleep_on_timeout);
7940 @@ -453,6 +452,11 @@
7941 EXPORT_SYMBOL(schedule_timeout);
7942 EXPORT_SYMBOL(yield);
7943 EXPORT_SYMBOL(__cond_resched);
7944 +EXPORT_SYMBOL(set_user_nice);
7946 +EXPORT_SYMBOL_GPL(set_cpus_allowed);
7948 +EXPORT_SYMBOL(nr_context_switches);
7949 EXPORT_SYMBOL(jiffies);
7950 EXPORT_SYMBOL(xtime);
7951 EXPORT_SYMBOL(do_gettimeofday);
7955 EXPORT_SYMBOL(kstat);
7956 -EXPORT_SYMBOL(nr_running);
7959 EXPORT_SYMBOL(panic);
7960 diff -urN linux-2.4.20/kernel/printk.c linux-2.4.20-o1/kernel/printk.c
7961 --- linux-2.4.20/kernel/printk.c Sat Aug 3 02:39:46 2002
7962 +++ linux-2.4.20-o1/kernel/printk.c Wed Mar 12 00:41:43 2003
7964 #include <linux/module.h>
7965 #include <linux/interrupt.h> /* For in_interrupt() */
7966 #include <linux/config.h>
7967 +#include <linux/delay.h>
7969 #include <asm/uaccess.h>
7971 diff -urN linux-2.4.20/kernel/ptrace.c linux-2.4.20-o1/kernel/ptrace.c
7972 --- linux-2.4.20/kernel/ptrace.c Sat Aug 3 02:39:46 2002
7973 +++ linux-2.4.20-o1/kernel/ptrace.c Wed Mar 12 00:41:43 2003
7975 if (child->state != TASK_STOPPED)
7978 - /* Make sure the child gets off its CPU.. */
7981 - if (!task_has_cpu(child))
7983 - task_unlock(child);
7985 - if (child->state != TASK_STOPPED)
7989 - } while (task_has_cpu(child));
7991 - task_unlock(child);
7992 + wait_task_inactive(child);
7996 diff -urN linux-2.4.20/kernel/sched.c linux-2.4.20-o1/kernel/sched.c
7997 --- linux-2.4.20/kernel/sched.c Fri Nov 29 00:53:15 2002
7998 +++ linux-2.4.20-o1/kernel/sched.c Wed Mar 12 00:41:43 2003
8001 * Kernel scheduler and related syscalls
8003 - * Copyright (C) 1991, 1992 Linus Torvalds
8004 + * Copyright (C) 1991-2002 Linus Torvalds
8006 * 1996-12-23 Modified by Dave Grothe to fix bugs in semaphores and
8007 * make semaphores SMP safe
8008 * 1998-11-19 Implemented schedule_timeout() and related stuff
8009 * by Andrea Arcangeli
8010 - * 1998-12-28 Implemented better SMP scheduling by Ingo Molnar
8011 + * 2002-01-04 New ultra-scalable O(1) scheduler by Ingo Molnar:
8012 + * hybrid priority-list and round-robin design with
8013 + * an array-switch method of distributing timeslices
8014 + * and per-CPU runqueues. Additional code by Davide
8015 + * Libenzi, Robert Love, and Rusty Russell.
8019 - * 'sched.c' is the main kernel file. It contains scheduling primitives
8020 - * (sleep_on, wakeup, schedule etc) as well as a number of simple system
8021 - * call functions (type getpid()), which just extract a field from
8025 -#include <linux/config.h>
8026 #include <linux/mm.h>
8027 -#include <linux/init.h>
8028 -#include <linux/smp_lock.h>
8029 #include <linux/nmi.h>
8030 #include <linux/interrupt.h>
8031 -#include <linux/kernel_stat.h>
8032 -#include <linux/completion.h>
8033 -#include <linux/prefetch.h>
8034 -#include <linux/compiler.h>
8036 +#include <linux/init.h>
8037 #include <asm/uaccess.h>
8038 +#include <linux/smp_lock.h>
8039 #include <asm/mmu_context.h>
8041 -extern void timer_bh(void);
8042 -extern void tqueue_bh(void);
8043 -extern void immediate_bh(void);
8044 +#include <linux/kernel_stat.h>
8045 +#include <linux/completion.h>
8048 - * scheduler variables
8049 + * Convert user-nice values [ -20 ... 0 ... 19 ]
8050 + * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
8053 +#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
8054 +#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
8055 +#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
8057 -unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
8059 -extern void mem_use(void);
8061 + * 'User priority' is the nice value converted to something we
8062 + * can work with better when scaling various scheduler parameters,
8063 + * it's a [ 0 ... 39 ] range.
8065 +#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
8066 +#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
8067 +#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
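Concretely, with MAX_RT_PRIO = 100 the nice mapping works out to:

    NICE_TO_PRIO(-20) = 100 - 20 + 20 = 100   (= MAX_RT_PRIO)
    NICE_TO_PRIO(  0) = 100 +  0 + 20 = 120
    NICE_TO_PRIO(+19) = 100 + 19 + 20 = 139   (= MAX_PRIO - 1)

USER_PRIO() shifts these back into 0..39, giving MAX_USER_PRIO = 40.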
8070 - * Scheduling quanta.
8071 + * These are the 'tuning knobs' of the scheduler:
8073 - * NOTE! The unix "nice" value influences how long a process
8074 - * gets. The nice value ranges from -20 to +19, where a -20
8075 - * is a "high-priority" task, and a "+10" is a low-priority
8078 - * We want the time-slice to be around 50ms or so, so this
8079 - * calculation depends on the value of HZ.
8080 + * Minimum timeslice is 10 msecs, default timeslice is 150 msecs,
8081 + * maximum timeslice is 300 msecs. Timeslices get refilled after
8085 -#define TICK_SCALE(x) ((x) >> 2)
8087 -#define TICK_SCALE(x) ((x) >> 1)
8089 -#define TICK_SCALE(x) (x)
8091 -#define TICK_SCALE(x) ((x) << 1)
8093 -#define TICK_SCALE(x) ((x) << 2)
8096 -#define NICE_TO_TICKS(nice) (TICK_SCALE(20-(nice))+1)
8098 +#define MIN_TIMESLICE ( 10 * HZ / 1000)
8099 +#define MAX_TIMESLICE (300 * HZ / 1000)
8100 +#define CHILD_PENALTY 50
8101 +#define PARENT_PENALTY 100
8102 +#define PRIO_BONUS_RATIO 25
8103 +#define INTERACTIVE_DELTA 2
8104 +#define MAX_SLEEP_AVG (2*HZ)
8105 +#define STARVATION_LIMIT (2*HZ)
8108 - * Init task must be ok at boot for the ix86 as we will check its signals
8109 - * via the SMP irq return path.
8110 + * If a task is 'interactive' then we reinsert it in the active
8111 + * array after it has expired its current timeslice. (it will not
8112 + * continue to run immediately, it will still roundrobin with
8113 + * other interactive tasks.)
8115 + * This part scales the interactivity limit depending on niceness.
8117 + * We scale it linearly, offset by the INTERACTIVE_DELTA delta.
8118 + * Here are a few examples of different nice levels:
8120 + * TASK_INTERACTIVE(-20): [1,1,1,1,1,1,1,1,1,0,0]
8121 + * TASK_INTERACTIVE(-10): [1,1,1,1,1,1,1,0,0,0,0]
8122 + * TASK_INTERACTIVE( 0): [1,1,1,1,0,0,0,0,0,0,0]
8123 + * TASK_INTERACTIVE( 10): [1,1,0,0,0,0,0,0,0,0,0]
8124 + * TASK_INTERACTIVE( 19): [0,0,0,0,0,0,0,0,0,0,0]
8126 + * (the X axis represents the possible -5 ... 0 ... +5 dynamic
8127 + * priority range a task can explore, a value of '1' means the
8128 + * task is rated interactive.)
8130 + * Ie. nice +19 tasks can never get 'interactive' enough to be
8131 + * reinserted into the active array. And only heavily CPU-hog nice -20
8132 + * tasks will be expired. Default nice 0 tasks are somewhere between,
8133 + * it takes some effort for them to get interactive, but it's not
8137 -struct task_struct * init_tasks[NR_CPUS] = {&init_task, };
8139 +#define SCALE(v1,v1_max,v2_max) \
8140 + (v1) * (v2_max) / (v1_max)
8143 + (SCALE(TASK_NICE(p), 40, MAX_USER_PRIO*PRIO_BONUS_RATIO/100) + \
8144 + INTERACTIVE_DELTA)
8146 +#define TASK_INTERACTIVE(p) \
8147 + ((p)->prio <= (p)->static_prio - DELTA(p))
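Worked example: a nice 0 task has TASK_NICE(p) = 0, so DELTA(p) = 0*10/40 + 2 = 2 and the task is rated interactive once its sleep bonus reaches 2 of the possible -5..+5 range, matching the four '1' entries in the TASK_INTERACTIVE(0) row above. A nice +19 task gets DELTA(p) = 19*10/40 + 2 = 6, which no bonus can reach, so it is never rated interactive.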
8150 - * The tasklist_lock protects the linked list of processes.
8152 - * The runqueue_lock locks the parts that actually access
8153 - * and change the run-queues, and have to be interrupt-safe.
8155 - * If both locks are to be concurrently held, the runqueue_lock
8156 - * nests inside the tasklist_lock.
8157 + * TASK_TIMESLICE scales user-nice values [ -20 ... 19 ]
8158 + * to time slice values.
8160 - * task->alloc_lock nests inside tasklist_lock.
8161 + * The higher a process's priority, the bigger timeslices
8162 + * it gets during one round of execution. But even the lowest
8163 + * priority process gets MIN_TIMESLICE worth of execution time.
8165 -spinlock_t runqueue_lock __cacheline_aligned = SPIN_LOCK_UNLOCKED; /* inner */
8166 -rwlock_t tasklist_lock __cacheline_aligned = RW_LOCK_UNLOCKED; /* outer */
8168 -static LIST_HEAD(runqueue_head);
8169 +#define TASK_TIMESLICE(p) (MIN_TIMESLICE + \
8170 + ((MAX_TIMESLICE - MIN_TIMESLICE) * (MAX_PRIO-1-(p)->static_prio)/39))
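For example, at HZ=100:

    MIN_TIMESLICE = 10*100/1000  =  1 tick  ( 10 ms)
    MAX_TIMESLICE = 300*100/1000 = 30 ticks (300 ms)

    nice -20 (static_prio 100): 1 + 29*39/39 = 30 ticks (300 ms)
    nice   0 (static_prio 120): 1 + 29*19/39 = 15 ticks (150 ms)
    nice +19 (static_prio 139): 1 + 29* 0/39 =  1 tick  ( 10 ms)

which matches the 10/150/300 msec figures quoted above.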
8173 - * We align per-CPU scheduling data on cacheline boundaries,
8174 - * to prevent cacheline ping-pong.
8175 + * These are the runqueue data structures:
8178 - struct schedule_data {
8179 - struct task_struct * curr;
8180 - cycles_t last_schedule;
8182 - char __pad [SMP_CACHE_BYTES];
8183 -} aligned_data [NR_CPUS] __cacheline_aligned = { {{&init_task,0}}};
8185 -#define cpu_curr(cpu) aligned_data[(cpu)].schedule_data.curr
8186 -#define last_schedule(cpu) aligned_data[(cpu)].schedule_data.last_schedule
8187 +#define BITMAP_SIZE ((((MAX_PRIO+1+7)/8)+sizeof(long)-1)/sizeof(long))
8189 -struct kernel_stat kstat;
8190 -extern struct task_struct *child_reaper;
8191 +typedef struct runqueue runqueue_t;
8194 +struct prio_array {
8196 + unsigned long bitmap[BITMAP_SIZE];
8197 + list_t queue[MAX_PRIO];
8200 -#define idle_task(cpu) (init_tasks[cpu_number_map(cpu)])
8201 -#define can_schedule(p,cpu) \
8202 - ((p)->cpus_runnable & (p)->cpus_allowed & (1 << cpu))
8204 + * This is the main, per-CPU runqueue data structure.
8206 + * Locking rule: those places that want to lock multiple runqueues
8207 + * (such as the load balancing or the process migration code), lock
8208 + * acquire operations must be ordered by ascending &runqueue.
8212 + unsigned long nr_running, nr_switches, expired_timestamp;
8213 + task_t *curr, *idle;
8214 + prio_array_t *active, *expired, arrays[2];
8215 + long nr_uninterruptible;
8218 + int prev_nr_running[NR_CPUS];
8219 + task_t *migration_thread;
8220 + list_t migration_queue;
8222 +} ____cacheline_aligned;
8225 +static struct runqueue runqueues[NR_CPUS] __cacheline_aligned;
8227 -#define idle_task(cpu) (&init_task)
8228 -#define can_schedule(p,cpu) (1)
8229 +#define cpu_rq(cpu) (runqueues + (cpu))
8230 +#define this_rq() cpu_rq(smp_processor_id())
8231 +#define task_rq(p) cpu_rq((p)->cpu)
8232 +#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
8233 +#define rt_task(p) ((p)->prio < MAX_RT_PRIO)
8236 + * Default context-switch locking:
8238 +#ifndef prepare_arch_switch
8239 +# define prepare_arch_switch(rq, next) do { } while(0)
8240 +# define finish_arch_switch(rq, prev) spin_unlock_irq(&(rq)->lock)
8241 +#endif
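These hooks bracket the low-level switch inside schedule(); stripped of statistics and error handling, the calling sequence is roughly (simplified sketch, not the verbatim patch code):

    prepare_arch_switch(rq, next);
    prev = context_switch(prev, next);  /* ends up in switch_to() */
    barrier();
    rq = this_rq();                     /* we may resume on another CPU */
    finish_arch_switch(rq, prev);       /* drops the runqueue lock */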
8243 -void scheduling_functions_start_here(void) { }
8246 - * This is the function that decides how desirable a process is..
8247 - * You can weigh different processes against each other depending
8248 - * on what CPU they've run on lately etc to try to handle cache
8249 - * and TLB miss penalties.
8252 - * -1000: never select this
8253 - * 0: out of time, recalculate counters (but it might still be
8255 - * +ve: "goodness" value (the larger, the better)
8256 - * +1000: realtime process, select this.
8257 + * task_rq_lock - lock the runqueue a given task resides on and disable
8258 + * interrupts. Note the ordering: we can safely lookup the task_rq without
8259 + * explicitly disabling preemption.
8262 -static inline int goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
8263 +static inline runqueue_t *task_rq_lock(task_t *p, unsigned long *flags)
8268 - * select the current process after every other
8269 - * runnable process, but before the idle thread.
8270 - * Also, dont trigger a counter recalculation.
8273 - if (p->policy & SCHED_YIELD)
8277 - * Non-RT process - normal case first.
8279 - if (p->policy == SCHED_OTHER) {
8281 - * Give the process a first-approximation goodness value
8282 - * according to the number of clock-ticks it has left.
8284 - * Don't do any other calculations if the time slice is
8287 - weight = p->counter;
8292 - /* Give a largish advantage to the same processor... */
8293 - /* (this is equivalent to penalizing other processors) */
8294 - if (p->processor == this_cpu)
8295 - weight += PROC_CHANGE_PENALTY;
8297 + struct runqueue *rq;
8299 - /* .. and a slight advantage to the current MM */
8300 - if (p->mm == this_mm || !p->mm)
8302 - weight += 20 - p->nice;
8304 +repeat_lock_task:
8305 + rq = task_rq(p);
8306 + spin_lock_irqsave(&rq->lock, *flags);
8307 + if (unlikely(rq != task_rq(p))) {
8308 + spin_unlock_irqrestore(&rq->lock, *flags);
8309 + goto repeat_lock_task;
8310 + }
8311 + return rq;
8312 +}
8315 - * Realtime process, select the first one on the
8316 - * runqueue (taking priorities within processes
8319 - weight = 1000 + p->rt_priority;
8322 +static inline void task_rq_unlock(runqueue_t *rq, unsigned long *flags)
8323 +{
8324 + spin_unlock_irqrestore(&rq->lock, *flags);
8325 +}
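Typical usage of the pair looks like this (sketch):

    unsigned long flags;
    runqueue_t *rq;

    rq = task_rq_lock(p, &flags);   /* p cannot change runqueues here */
    /* ... inspect or modify p's scheduling state ... */
    task_rq_unlock(rq, &flags);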
8328 - * the 'goodness value' of replacing a process on a given CPU.
8329 - * positive value means 'replace', zero or negative means 'dont'.
8330 + * Adding/removing a task to/from a priority array:
8332 -static inline int preemption_goodness(struct task_struct * prev, struct task_struct * p, int cpu)
8333 +static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
8335 - return goodness(p, cpu, prev->active_mm) - goodness(prev, cpu, prev->active_mm);
8336 + array->nr_active--;
8337 + list_del(&p->run_list);
8338 + if (list_empty(array->queue + p->prio))
8339 + __clear_bit(p->prio, array->bitmap);
8343 - * This is ugly, but reschedule_idle() is very timing-critical.
8344 - * We are called with the runqueue spinlock held and we must
8345 - * not claim the tasklist_lock.
8347 -static FASTCALL(void reschedule_idle(struct task_struct * p));
8348 +#define enqueue_task(p, array) __enqueue_task(p, array, NULL)
8349 +static inline void __enqueue_task(struct task_struct *p, prio_array_t *array, task_t * parent)
8352 + list_add_tail(&p->run_list, array->queue + p->prio);
8353 + __set_bit(p->prio, array->bitmap);
8356 + list_add_tail(&p->run_list, &parent->run_list);
8357 + array = p->array = parent->array;
8359 + array->nr_active++;
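Together, dequeue_task() and __enqueue_task() maintain the invariant that a bit is set in array->bitmap exactly when the corresponding per-priority list is non-empty. A sketch of the O(1) lookup this enables (illustrative only; schedule() further below uses the same pattern):

	/* Illustrative: pick the highest-priority task of a non-empty array.
	 * Lower p->prio values mean higher priority, and the delimiter bit
	 * set at MAX_PRIO by sched_init() bounds the bitmap search. */
	static task_t *peek_next_task(prio_array_t *array)
	{
		int idx = sched_find_first_bit(array->bitmap);
		return list_entry(array->queue[idx].next, task_t, run_list);
	}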
8362 -static void reschedule_idle(struct task_struct * p)
8363 +static inline int effective_prio(task_t *p)
8366 - int this_cpu = smp_processor_id();
8367 - struct task_struct *tsk, *target_tsk;
8368 - int cpu, best_cpu, i, max_prio;
8369 - cycles_t oldest_idle;
8373 - * shortcut if the woken up task's last CPU is
8375 + * Here we scale the actual sleep average [0 .... MAX_SLEEP_AVG]
8376 + * into the -5 ... 0 ... +5 bonus/penalty range.
8378 + * We use 25% of the full 0...39 priority range so that:
8380 + * 1) nice +19 interactive tasks do not preempt nice 0 CPU hogs.
8381 + * 2) nice -20 CPU hogs do not get preempted by nice 0 tasks.
8383 + * Both properties are important to certain workloads.
8385 - best_cpu = p->processor;
8386 - if (can_schedule(p, best_cpu)) {
8387 - tsk = idle_task(best_cpu);
8388 - if (cpu_curr(best_cpu) == tsk) {
8392 - * If need_resched == -1 then we can skip sending
8393 - * the IPI altogether, tsk->need_resched is
8394 - * actively watched by the idle thread.
8396 - need_resched = tsk->need_resched;
8397 - tsk->need_resched = 1;
8398 - if ((best_cpu != this_cpu) && !need_resched)
8399 - smp_send_reschedule(best_cpu);
8403 + bonus = MAX_USER_PRIO*PRIO_BONUS_RATIO*p->sleep_avg/MAX_SLEEP_AVG/100 -
8404 + MAX_USER_PRIO*PRIO_BONUS_RATIO/100/2;
8407 - * We know that the preferred CPU has a cache-affine current
8408 - * process, lets try to find a new idle CPU for the woken-up
8409 - * process. Select the least recently active idle CPU. (that
8410 - * one will have the least active cache context.) Also find
8411 - * the executing process which has the least priority.
8413 - oldest_idle = (cycles_t) -1;
8414 - target_tsk = NULL;
8416 + prio = p->static_prio - bonus;
8417 + if (prio < MAX_RT_PRIO)
8418 + prio = MAX_RT_PRIO;
8419 + if (prio > MAX_PRIO-1)
8420 + prio = MAX_PRIO-1;
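Worked example (illustrative, assuming MAX_USER_PRIO = 40 and PRIO_BONUS_RATIO = 25, consistent with the "25% of the full 0...39 priority range" note above): the bonus expression reduces to 10*sleep_avg/MAX_SLEEP_AVG - 5. A task that never sleeps gets bonus -5 and runs at prio = static_prio + 5 (a penalty), a task with a full sleep average gets +5 and runs at static_prio - 5 (a boost), and the result is clamped to the [MAX_RT_PRIO, MAX_PRIO-1] range.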
8424 - for (i = 0; i < smp_num_cpus; i++) {
8425 - cpu = cpu_logical_map(i);
8426 - if (!can_schedule(p, cpu))
8428 - tsk = cpu_curr(cpu);
8429 +#define activate_task(p, rq) __activate_task(p, rq, NULL)
8430 +static inline void __activate_task(task_t *p, runqueue_t *rq, task_t * parent)
8432 + unsigned long sleep_time = jiffies - p->sleep_timestamp;
8433 + prio_array_t *array = rq->active;
8435 + if (!parent && !rt_task(p) && sleep_time) {
8437 - * We use the first available idle CPU. This creates
8438 - * a priority list between idle CPUs, but this is not
8440 + * This code gives a bonus to interactive tasks. We update
8441 + * an 'average sleep time' value here, based on
8442 + * sleep_timestamp. The more time a task spends sleeping,
8443 + * the higher the average gets - and the higher the priority
8444 + * boost gets as well.
8446 - if (tsk == idle_task(cpu)) {
8447 -#if defined(__i386__) && defined(CONFIG_SMP)
8449 - * Check if two siblings are idle in the same
8450 - * physical package. Use them if found.
8452 - if (smp_num_siblings == 2) {
8453 - if (cpu_curr(cpu_sibling_map[cpu]) ==
8454 - idle_task(cpu_sibling_map[cpu])) {
8455 - oldest_idle = last_schedule(cpu);
8462 - if (last_schedule(cpu) < oldest_idle) {
8463 - oldest_idle = last_schedule(cpu);
8467 - if (oldest_idle == -1ULL) {
8468 - int prio = preemption_goodness(tsk, p, cpu);
8470 - if (prio > max_prio) {
8479 - if (oldest_idle != -1ULL) {
8480 - best_cpu = tsk->processor;
8481 - goto send_now_idle;
8483 - tsk->need_resched = 1;
8484 - if (tsk->processor != this_cpu)
8485 - smp_send_reschedule(tsk->processor);
8486 + p->sleep_timestamp = jiffies;
8487 + p->sleep_avg += sleep_time;
8488 + if (p->sleep_avg > MAX_SLEEP_AVG)
8489 + p->sleep_avg = MAX_SLEEP_AVG;
8490 + p->prio = effective_prio(p);
8494 + __enqueue_task(p, array, parent);
8498 +static inline void deactivate_task(struct task_struct *p, runqueue_t *rq)
8501 + if (p->state == TASK_UNINTERRUPTIBLE)
8502 + rq->nr_uninterruptible++;
8503 + dequeue_task(p, p->array);
8508 - int this_cpu = smp_processor_id();
8509 - struct task_struct *tsk;
8510 +static inline void resched_task(task_t *p)
8515 - tsk = cpu_curr(this_cpu);
8516 - if (preemption_goodness(tsk, p, this_cpu) > 0)
8517 - tsk->need_resched = 1;
8518 + need_resched = p->need_resched;
8519 + set_tsk_need_resched(p);
8520 + if (!need_resched && (p->cpu != smp_processor_id()))
8521 + smp_send_reschedule(p->cpu);
8523 + set_tsk_need_resched(p);
8532 - * This has to add the process to the _end_ of the
8533 - * run-queue, not the beginning. The goodness value will
8534 - * determine whether this process will run next. This is
8535 - * important to get SCHED_FIFO and SCHED_RR right, where
8536 - * a process that is either pre-empted or its time slice
8537 - * has expired, should be moved to the tail of the run
8538 - * queue for its priority - Bhavesh Davda
8539 + * Wait for a process to unschedule. This is used by the exit() and ptrace() code paths.
8542 -static inline void add_to_runqueue(struct task_struct * p)
8543 +void wait_task_inactive(task_t * p)
8545 - list_add_tail(&p->run_list, &runqueue_head);
8547 + unsigned long flags;
8552 + if (unlikely(rq->curr == p)) {
8557 + rq = task_rq_lock(p, &flags);
8558 + if (unlikely(rq->curr == p)) {
8559 + task_rq_unlock(rq, &flags);
8562 + task_rq_unlock(rq, &flags);
8565 -static inline void move_last_runqueue(struct task_struct * p)
8567 + * Kick the remote CPU if the task is currently running;
8568 + * this code is used by the signal code to signal tasks
8569 + * which are in user-mode as quickly as possible.
8571 + * (Note that we do this lockless - if the task does anything
8572 + * while the message is in flight then it will notice the
8573 + * sigpending condition anyway.)
8575 +void kick_if_running(task_t * p)
8577 - list_del(&p->run_list);
8578 - list_add_tail(&p->run_list, &runqueue_head);
8579 + if (p == task_rq(p)->curr && p->cpu != smp_processor_id())
8585 +static int FASTCALL(reschedule_idle(task_t * p));
8586 +static void FASTCALL(load_balance(runqueue_t *this_rq, int idle));
8591 * Wake up a process. Put it on the run-queue if it's not
8592 @@ -345,429 +338,721 @@
8593 * progress), and as such you're allowed to do the simpler
8594 * "current->state = TASK_RUNNING" to mark yourself runnable
8595 * without the overhead of this.
8597 + * returns failure only if the task is already active.
8599 -static inline int try_to_wake_up(struct task_struct * p, int synchronous)
8600 +static int try_to_wake_up(task_t * p, int sync)
8602 unsigned long flags;
8607 + int migrated_to_idle = 0;
8613 + rq = task_rq_lock(p, &flags);
8614 + old_state = p->state;
8617 + if (likely(rq->curr != p)) {
8619 + if (unlikely(sync)) {
8620 + if (p->cpu != smp_processor_id() &&
8621 + p->cpus_allowed & (1UL << smp_processor_id())) {
8622 + p->cpu = smp_processor_id();
8623 + goto migrated_task;
8626 + if (reschedule_idle(p))
8627 + goto migrated_task;
8631 + if (old_state == TASK_UNINTERRUPTIBLE)
8632 + rq->nr_uninterruptible--;
8633 + activate_task(p, rq);
8634 + if (p->prio < rq->curr->prio)
8635 + resched_task(rq->curr);
8638 + p->state = TASK_RUNNING;
8642 - * We want the common case fall through straight, thus the goto.
8643 + * Subtle: we can load_balance only here (before unlock)
8644 + * because it can internally drop the lock. Claim
8645 + * that the cpu is running so it will be a light rebalance,
8646 + * if this cpu goes idle soon, schedule() will trigger the
8647 + * idle rebalancing by itself.
8649 - spin_lock_irqsave(&runqueue_lock, flags);
8650 - p->state = TASK_RUNNING;
8651 - if (task_on_runqueue(p))
8653 - add_to_runqueue(p);
8654 - if (!synchronous || !(p->cpus_allowed & (1 << smp_processor_id())))
8655 - reschedule_idle(p);
8658 - spin_unlock_irqrestore(&runqueue_lock, flags);
8659 + if (success && migrated_to_idle)
8660 + load_balance(rq, 0);
8663 + task_rq_unlock(rq, &flags);
8669 + task_rq_unlock(rq, &flags);
8670 + migrated_to_idle = 1;
8671 + goto repeat_lock_task;
8675 -inline int wake_up_process(struct task_struct * p)
8676 +int wake_up_process(task_t * p)
8678 return try_to_wake_up(p, 0);
8681 -static void process_timeout(unsigned long __data)
8682 +void wake_up_forked_process(task_t * p)
8684 - struct task_struct * p = (struct task_struct *) __data;
8686 + task_t * parent = current;
8688 - wake_up_process(p);
8691 + spin_lock_irq(&rq->lock);
8694 - * schedule_timeout - sleep until timeout
8695 - * @timeout: timeout value in jiffies
8697 - * Make the current task sleep until @timeout jiffies have
8698 - * elapsed. The routine will return immediately unless
8699 - * the current task state has been set (see set_current_state()).
8701 - * You can set the task state as follows -
8703 - * %TASK_UNINTERRUPTIBLE - at least @timeout jiffies are guaranteed to
8704 - * pass before the routine returns. The routine will return 0
8706 - * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
8707 - * delivered to the current task. In this case the remaining time
8708 - * in jiffies will be returned, or 0 if the timer expired in time
8710 - * The current task state is guaranteed to be TASK_RUNNING when this
8711 - * routine returns.
8713 - * Specifying a @timeout value of %MAX_SCHEDULE_TIMEOUT will schedule
8714 - * the CPU away without a bound on the timeout. In this case the return
8715 - * value will be %MAX_SCHEDULE_TIMEOUT.
8717 - * In all cases the return value is guaranteed to be non-negative.
8719 -signed long schedule_timeout(signed long timeout)
8721 - struct timer_list timer;
8722 - unsigned long expire;
8723 + p->state = TASK_RUNNING;
8724 + if (likely(!rt_task(p) && parent->array)) {
8726 + * We decrease the sleep average of forked
8727 + * children, to keep max-interactive tasks
8728 + * from forking tasks that are max-interactive.
8729 + * CHILD_PENALTY is set to 50% since we have
8730 + * no clue if this is still an interactive
8731 + * task like the parent or if this will be a
8732 + * cpu bound task. The parent isn't touched
8733 + * as we don't make assumptions about the parent
8734 + * changing behaviour after the child is forked.
8736 + parent->sleep_avg = parent->sleep_avg * PARENT_PENALTY / 100;
8737 + p->sleep_avg = p->sleep_avg * CHILD_PENALTY / 100;
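Numerically (illustrative, assuming PARENT_PENALTY = 100 and CHILD_PENALTY = 50, consistent with the "50%" note above): the parent's sleep_avg is scaled by 100/100 and thus left unchanged, while the child's is halved.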
8741 - case MAX_SCHEDULE_TIMEOUT:
8743 - * These two special cases are useful to be comfortable
8744 - * in the caller. Nothing more. We could take
8745 - * MAX_SCHEDULE_TIMEOUT from one of the negative value
8746 - * but I' d like to return a valid offset (>=0) to allow
8747 - * the caller to do everything it want with the retval.
8748 + * For its first schedule keep the child at the same
8749 + * priority (i.e. in the same list) as the parent,
8750 + * activate_forked_task() will take care to put the
8751 + * child in front of the parent (lifo) to guarantee a
8752 + * schedule-child-first behaviour after fork.
8757 + p->prio = parent->prio;
8760 - * Another bit of PARANOID. Note that the retval will be
8761 - * 0 since no piece of kernel is supposed to do a check
8762 - * for a negative retval of schedule_timeout() (since it
8763 - * should never happens anyway). You just have the printk()
8764 - * that will tell you if something is gone wrong and where.
8765 + * Take the usual wakeup path if it's RT or if
8766 + * it's a child of the first idle task (during boot only).
8771 - printk(KERN_ERR "schedule_timeout: wrong timeout "
8772 - "value %lx from %p\n", timeout,
8773 - __builtin_return_address(0));
8774 - current->state = TASK_RUNNING;
8777 + p->prio = effective_prio(p);
8781 - expire = timeout + jiffies;
8782 + p->cpu = smp_processor_id();
8783 + __activate_task(p, rq, parent);
8784 + spin_unlock_irq(&rq->lock);
8787 - init_timer(&timer);
8788 - timer.expires = expire;
8789 - timer.data = (unsigned long) current;
8790 - timer.function = process_timeout;
8792 + * Potentially available exiting-child timeslices are
8793 + * retrieved here - this way the parent does not get
8794 + * penalized for creating too many processes.
8796 + * (this cannot be used to 'generate' timeslices
8797 + * artificially, because any timeslice recovered here
8798 + * was given away by the parent in the first place.)
8800 +void sched_exit(task_t * p)
8803 + if (p->first_time_slice) {
8804 + current->time_slice += p->time_slice;
8805 + if (unlikely(current->time_slice > MAX_TIMESLICE))
8806 + current->time_slice = MAX_TIMESLICE;
8811 - add_timer(&timer);
8813 - del_timer_sync(&timer);
8815 +asmlinkage void schedule_tail(task_t *prev)
8817 + finish_arch_switch(this_rq(), prev);
8821 +static inline task_t * context_switch(task_t *prev, task_t *next)
8823 + struct mm_struct *mm = next->mm;
8824 + struct mm_struct *oldmm = prev->active_mm;
8826 + if (unlikely(!mm)) {
8827 + next->active_mm = oldmm;
8828 + atomic_inc(&oldmm->mm_count);
8829 + enter_lazy_tlb(oldmm, next, smp_processor_id());
8831 + switch_mm(oldmm, mm, next, smp_processor_id());
8833 + if (unlikely(!prev->mm)) {
8834 + prev->active_mm = NULL;
8838 - timeout = expire - jiffies;
8839 + /* Here we just switch the register state and the stack. */
8840 + switch_to(prev, next, prev);
8843 - return timeout < 0 ? 0 : timeout;
8848 - * schedule_tail() is getting called from the fork return path. This
8849 - * cleans up all remaining scheduler things, without impacting the
8852 -static inline void __schedule_tail(struct task_struct *prev)
8853 +unsigned long nr_running(void)
8857 + unsigned long i, sum = 0;
8860 - * prev->policy can be written from here only before `prev'
8861 - * can be scheduled (before setting prev->cpus_runnable to ~0UL).
8862 - * Of course it must also be read before allowing prev
8863 - * to be rescheduled, but since the write depends on the read
8864 - * to complete, wmb() is enough. (the spin_lock() acquired
8865 - * before setting cpus_runnable is not enough because the spin_lock()
8866 - * common code semantics allows code outside the critical section
8867 - * to enter inside the critical section)
8869 - policy = prev->policy;
8870 - prev->policy = policy & ~SCHED_YIELD;
8872 + for (i = 0; i < smp_num_cpus; i++)
8873 + sum += cpu_rq(cpu_logical_map(i))->nr_running;
8876 - * fast path falls through. We have to clear cpus_runnable before
8877 - * checking prev->state to avoid a wakeup race. Protect against
8878 - * the task exiting early.
8881 - task_release_cpu(prev);
8883 - if (prev->state == TASK_RUNNING)
8884 - goto needs_resched;
8889 - task_unlock(prev); /* Synchronise here with release_task() if prev is TASK_ZOMBIE */
8891 +/* Note: the per-cpu information is useful only to get the cumulative result */
8892 +unsigned long nr_uninterruptible(void)
8894 + unsigned long i, sum = 0;
8897 - * Slow path - we 'push' the previous process and
8898 - * reschedule_idle() will attempt to find a new
8899 - * processor for it. (but it might preempt the
8900 - * current process as well.) We must take the runqueue
8901 - * lock and re-check prev->state to be correct. It might
8902 - * still happen that this process has a preemption
8903 - * 'in progress' already - but this is not a problem and
8904 - * might happen in other circumstances as well.
8908 - unsigned long flags;
8909 + for (i = 0; i < smp_num_cpus; i++)
8910 + sum += cpu_rq(cpu_logical_map(i))->nr_uninterruptible;
8913 - * Avoid taking the runqueue lock in cases where
8914 - * no preemption-check is necessery:
8916 - if ((prev == idle_task(smp_processor_id())) ||
8917 - (policy & SCHED_YIELD))
8922 - spin_lock_irqsave(&runqueue_lock, flags);
8923 - if ((prev->state == TASK_RUNNING) && !task_has_cpu(prev))
8924 - reschedule_idle(prev);
8925 - spin_unlock_irqrestore(&runqueue_lock, flags);
8929 - prev->policy &= ~SCHED_YIELD;
8930 -#endif /* CONFIG_SMP */
8931 +unsigned long nr_context_switches(void)
8933 + unsigned long i, sum = 0;
8935 + for (i = 0; i < smp_num_cpus; i++)
8936 + sum += cpu_rq(cpu_logical_map(i))->nr_switches;
8941 -asmlinkage void schedule_tail(struct task_struct *prev)
8942 +inline int idle_cpu(int cpu)
8944 - __schedule_tail(prev);
8945 + return cpu_curr(cpu) == cpu_rq(cpu)->idle;
8950 - * 'schedule()' is the scheduler function. It's a very simple and nice
8951 - * scheduler: it's not perfect, but certainly works for most things.
8953 - * The goto is "interesting".
8955 - * NOTE!! Task 0 is the 'idle' task, which gets called when no other
8956 - * tasks can run. It can not be killed, and it cannot sleep. The 'state'
8957 - * information in task[0] is never used.
8958 + * Lock the busiest runqueue as well; this_rq is locked already.
8959 + * Recalculate nr_running if we have to drop the runqueue lock.
8961 -asmlinkage void schedule(void)
8962 +static inline unsigned int double_lock_balance(runqueue_t *this_rq,
8963 + runqueue_t *busiest, int this_cpu, int idle, unsigned int nr_running)
8965 - struct schedule_data * sched_data;
8966 - struct task_struct *prev, *next, *p;
8967 - struct list_head *tmp;
8969 + if (unlikely(!spin_trylock(&busiest->lock))) {
8970 + if (busiest < this_rq) {
8971 + spin_unlock(&this_rq->lock);
8972 + spin_lock(&busiest->lock);
8973 + spin_lock(&this_rq->lock);
8974 + /* Need to recalculate nr_running */
8975 + if (idle || (this_rq->nr_running > this_rq->prev_nr_running[this_cpu]))
8976 + nr_running = this_rq->nr_running;
8978 + nr_running = this_rq->prev_nr_running[this_cpu];
8980 + spin_lock(&busiest->lock);
8982 + return nr_running;
8986 + * Move a task from a remote runqueue to the local runqueue.
8987 + * Both runqueues must be locked.
8989 +static inline int pull_task(runqueue_t *src_rq, prio_array_t *src_array, task_t *p, runqueue_t *this_rq, int this_cpu)
8993 - spin_lock_prefetch(&runqueue_lock);
8994 + dequeue_task(p, src_array);
8995 + src_rq->nr_running--;
8996 + p->cpu = this_cpu;
8997 + this_rq->nr_running++;
8998 + enqueue_task(p, this_rq->active);
9000 + * Note that idle threads have a prio of MAX_PRIO, so this test
9001 + * is always true for them.
9003 + if (p->prio < this_rq->curr->prio)
9006 - BUG_ON(!current->active_mm);
9009 - this_cpu = prev->processor;
9013 - if (unlikely(in_interrupt())) {
9014 - printk("Scheduling in interrupt\n");
9016 +static inline int idle_cpu_reschedule(task_t * p, int cpu)
9018 + if (unlikely(!(p->cpus_allowed & (1UL << cpu))))
9020 + return idle_cpu(cpu);
9023 +#include <linux/smp_balance.h>
9025 +static int reschedule_idle(task_t * p)
9027 + int p_cpu = p->cpu, i;
9029 + if (idle_cpu(p_cpu))
9032 + p_cpu = cpu_number_map(p_cpu);
9034 + for (i = (p_cpu + 1) % smp_num_cpus;
9036 + i = (i + 1) % smp_num_cpus) {
9037 + int physical = cpu_logical_map(i);
9039 + if (idle_cpu_reschedule(p, physical)) {
9040 + physical = arch_reschedule_idle_override(p, physical);
9041 + p->cpu = physical;
9046 - release_kernel_lock(prev, this_cpu);
9051 + * Current runqueue is empty, or rebalance tick: if there is an
9052 + * imbalance (current runqueue is too short) then pull from
9053 + * busiest runqueue(s).
9055 + * We call this with the current runqueue locked, irqs disabled.
9058 +static void load_balance(runqueue_t *this_rq, int idle)
9060 + int imbalance, nr_running, load, max_load,
9061 + idx, i, this_cpu = this_rq - runqueues;
9063 + runqueue_t *busiest, *rq_src;
9064 + prio_array_t *array;
9065 + list_t *head, *curr;
9069 - * 'sched_data' is protected by the fact that we can run
9070 - * only one process per CPU.
9071 + * Handle architecture-specific balancing, such as hyperthreading.
9073 - sched_data = & aligned_data[this_cpu].schedule_data;
9074 + if (arch_load_balance(this_cpu, idle))
9077 - spin_lock_irq(&runqueue_lock);
9080 + * We search all runqueues to find the most busy one.
9081 + * We do this lockless to reduce cache-bouncing overhead;
9082 + * we re-check the 'best' source CPU later on again, with the lock held.
9085 + * We fend off statistical fluctuations in runqueue lengths by
9086 + * saving the runqueue length during the previous load-balancing
9087 + * operation and using the smaller of the current and saved lengths.
9088 + * If a runqueue stays long enough for a sustained period of time then
9089 + * we recognize it and pull tasks from it.
9091 + * The 'current runqueue length' is a statistical maximum variable,
9092 + * for which we take the longer value - to avoid fluctuations in
9093 + * the other direction. So for a load-balance to happen it needs a
9094 + * stable, long runqueue on the target CPU and a stable, short runqueue
9095 + * on the local runqueue.
9097 + * We make an exception if this CPU is about to become idle - in
9098 + * that case we are less picky about moving a task across CPUs and
9099 + * take what can be taken.
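Worked example (illustrative numbers): if the local runqueue has nr_running = 2 and the busiest runqueue has max_load = 6, then imbalance = (6 - 2)/2 = 2 below, which meets the busy-path trigger imbalance >= (max_load + 3)/4 = 2, so up to two tasks get pulled.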
9101 + if (idle || (this_rq->nr_running > this_rq->prev_nr_running[this_cpu]))
9102 + nr_running = this_rq->nr_running;
9104 + nr_running = this_rq->prev_nr_running[this_cpu];
9106 - /* move an exhausted RR process to be last.. */
9107 - if (unlikely(prev->policy == SCHED_RR))
9108 - if (!prev->counter) {
9109 - prev->counter = NICE_TO_TICKS(prev->nice);
9110 - move_last_runqueue(prev);
9114 + for (i = 0; i < smp_num_cpus; i++) {
9115 + int logical = cpu_logical_map(i);
9117 - switch (prev->state) {
9118 - case TASK_INTERRUPTIBLE:
9119 - if (signal_pending(prev)) {
9120 - prev->state = TASK_RUNNING;
9124 - del_from_runqueue(prev);
9125 - case TASK_RUNNING:;
9126 + rq_src = cpu_rq(logical);
9127 + if (idle || (rq_src->nr_running < this_rq->prev_nr_running[logical]))
9128 + load = rq_src->nr_running;
9130 + load = this_rq->prev_nr_running[logical];
9131 + this_rq->prev_nr_running[logical] = rq_src->nr_running;
9133 + if ((load > max_load) && (rq_src != this_rq)) {
9138 - prev->need_resched = 0;
9140 + if (likely(!busiest))
9143 + imbalance = (max_load - nr_running) / 2;
9145 + /* It needs at least a ~25% imbalance to trigger balancing. */
9146 + if (!idle && (imbalance < (max_load + 3)/4))
9150 - * this is the scheduler proper:
9151 + * Make sure nothing significant changed since we checked the
9152 + * runqueue length.
9154 + if (double_lock_balance(this_rq, busiest, this_cpu, idle, nr_running) > nr_running ||
9155 + busiest->nr_running < max_load)
9156 + goto out_unlock_retry;
9160 - * Default process to select..
9161 + * We first consider expired tasks. Those will likely not be
9162 + * executed in the near future, and they are most likely to
9163 + * be cache-cold, thus switching CPUs has the least effect
9166 - next = idle_task(this_cpu);
9168 - list_for_each(tmp, &runqueue_head) {
9169 - p = list_entry(tmp, struct task_struct, run_list);
9170 - if (can_schedule(p, this_cpu)) {
9171 - int weight = goodness(p, this_cpu, prev->active_mm);
9173 - c = weight, next = p;
9174 + if (busiest->expired->nr_active)
9175 + array = busiest->expired;
9177 + array = busiest->active;
9181 + /* Start searching at priority 0: */
9185 + idx = sched_find_first_bit(array->bitmap);
9187 + idx = find_next_bit(array->bitmap, MAX_PRIO, idx);
9188 + if (idx == MAX_PRIO) {
9189 + if (array == busiest->expired) {
9190 + array = busiest->active;
9196 - /* Do we need to re-calculate counters? */
9197 - if (unlikely(!c)) {
9198 - struct task_struct *p;
9200 - spin_unlock_irq(&runqueue_lock);
9201 - read_lock(&tasklist_lock);
9203 - p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
9204 - read_unlock(&tasklist_lock);
9205 - spin_lock_irq(&runqueue_lock);
9206 - goto repeat_schedule;
9207 + head = array->queue + idx;
9208 + curr = head->prev;
9210 + tmp = list_entry(curr, task_t, run_list);
9213 + * We do not migrate tasks that:
9214 + * 1) are running (obviously), or
9215 + * 2) cannot be migrated to this CPU due to cpus_allowed, or
9216 + * 3) are cache-hot on their current CPU.
9219 +#define CAN_MIGRATE_TASK(p,rq,this_cpu) \
9220 + ((jiffies - (p)->sleep_timestamp > cache_decay_ticks) && \
9221 + ((p) != (rq)->curr) && \
9222 + ((p)->cpus_allowed & (1UL << (this_cpu))))
9224 + curr = curr->prev;
9226 + if (!CAN_MIGRATE_TASK(tmp, busiest, this_cpu)) {
9232 + resched |= pull_task(busiest, array, tmp, this_rq, this_cpu);
9233 + if (--imbalance > 0) {
9240 + spin_unlock(&busiest->lock);
9242 + resched_task(this_rq->curr);
9245 + spin_unlock(&busiest->lock);
9250 - * from this point on nothing can prevent us from
9251 - * switching to the next task, save this fact in
9254 - sched_data->curr = next;
9255 - task_set_cpu(next, this_cpu);
9256 - spin_unlock_irq(&runqueue_lock);
9258 - if (unlikely(prev == next)) {
9259 - /* We won't go through the normal tail, so do this by hand */
9260 - prev->policy &= ~SCHED_YIELD;
9261 - goto same_process;
9263 + * Either the idle_cpu_tick() or the busy_cpu_tick() function gets
9264 + * called every timer tick, on every CPU. Our balancing action
9265 + * frequency and aggressiveness depend on whether the CPU is idle or busy:
9268 + * busy-rebalance every 250 msecs, idle-rebalance every 100 msecs.
9270 +#define BUSY_REBALANCE_TICK (HZ/4 ?: 1)
9271 +#define IDLE_REBALANCE_TICK (HZ/10 ?: 1)
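(With the HZ = 100 common on 2.4-era x86, these evaluate to 25 and 10 ticks, i.e. the 250 and 100 msec intervals mentioned above; the GNU "?: 1" form merely guards against a zero interval on an unusually low HZ.)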
9273 +static inline void idle_tick(void)
9275 + if (unlikely(time_before_eq(this_rq()->last_jiffy + IDLE_REBALANCE_TICK, jiffies))) {
9276 + spin_lock(&this_rq()->lock);
9277 + load_balance(this_rq(), 1);
9278 + spin_unlock(&this_rq()->lock);
9279 + this_rq()->last_jiffy = jiffies;
9285 - * maintain the per-process 'last schedule' value.
9286 - * (this has to be recalculated even if we reschedule to
9287 - * the same process) Currently this is only used on SMP,
9288 - * and it's approximate, so we do not have to maintain
9289 - * it while holding the runqueue spinlock.
9291 - sched_data->last_schedule = get_cycles();
9295 - * We drop the scheduler lock early (it's a global spinlock),
9296 - * thus we have to lock the previous process from getting
9297 - * rescheduled during switch_to().
9300 + * We place interactive tasks back into the active array, if possible.
9302 + * To guarantee that this does not starve expired tasks we ignore the
9303 + * interactivity of a task if the first expired task had to wait more
9304 + * than a 'reasonable' amount of time. This deadline timeout is
9305 + * load-dependent, as the frequency of array switches decreases with
9306 + * increasing number of running tasks:
9308 +#define EXPIRED_STARVING(rq) \
9309 + ((rq)->expired_timestamp && \
9310 + (jiffies - (rq)->expired_timestamp >= \
9311 + STARVATION_LIMIT * ((rq)->nr_running) + 1))
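Worked example (illustrative, assuming STARVATION_LIMIT = 10 jiffies): with nr_running = 5, the first expired task counts as starving once it has waited 10 * 5 + 1 = 51 jiffies, so the tolerated delay before a forced array switch grows linearly with load.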
9313 -#endif /* CONFIG_SMP */
9315 + * This function gets called by the timer code, with HZ frequency.
9316 + * We call it with interrupts disabled.
9318 +void scheduler_tick(int user_tick, int system)
9320 + int cpu = smp_processor_id();
9321 + runqueue_t *rq = this_rq();
9322 + task_t *p = current;
9324 - kstat.context_swtch++;
9326 - * there are 3 processes which are affected by a context switch:
9328 - * prev == .... ==> (last => next)
9330 - * It's the 'much more previous' 'prev' that is on next's stack,
9331 - * but prev is set to (the just run) 'last' process by switch_to().
9332 - * This might sound slightly confusing but makes tons of sense.
9334 - prepare_to_switch();
9336 - struct mm_struct *mm = next->mm;
9337 - struct mm_struct *oldmm = prev->active_mm;
9339 - BUG_ON(next->active_mm);
9340 - next->active_mm = oldmm;
9341 - atomic_inc(&oldmm->mm_count);
9342 - enter_lazy_tlb(oldmm, next, this_cpu);
9344 - BUG_ON(next->active_mm != mm);
9345 - switch_mm(oldmm, mm, next, this_cpu);
9346 + if (p == rq->idle) {
9347 + if (local_bh_count(cpu) || local_irq_count(cpu) > 1)
9348 + kstat.per_cpu_system[cpu] += system;
9354 + if (TASK_NICE(p) > 0)
9355 + kstat.per_cpu_nice[cpu] += user_tick;
9357 + kstat.per_cpu_user[cpu] += user_tick;
9358 + kstat.per_cpu_system[cpu] += system;
9360 + /* Task might have expired already, but not scheduled off yet */
9361 + if (p->array != rq->active) {
9362 + set_tsk_need_resched(p);
9365 + spin_lock(&rq->lock);
9366 + if (unlikely(rt_task(p))) {
9368 + * RR tasks need a special form of timeslice management.
9369 + * FIFO tasks have no timeslices.
9371 + if ((p->policy == SCHED_RR) && !--p->time_slice) {
9372 + p->time_slice = TASK_TIMESLICE(p);
9373 + p->first_time_slice = 0;
9374 + set_tsk_need_resched(p);
9376 + /* put it at the end of the queue: */
9377 + dequeue_task(p, rq->active);
9378 + enqueue_task(p, rq->active);
9383 + * The task was running during this tick - update the
9384 + * time slice counter and the sleep average. Note: we
9385 + * do not update a process's priority until it either
9386 + * goes to sleep or uses up its timeslice. This makes
9387 + * it possible for interactive tasks to use up their
9388 + * timeslices at their highest priority levels.
9392 + if (!--p->time_slice) {
9393 + dequeue_task(p, rq->active);
9394 + set_tsk_need_resched(p);
9395 + p->prio = effective_prio(p);
9396 + p->time_slice = TASK_TIMESLICE(p);
9397 + p->first_time_slice = 0;
9399 + if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
9400 + if (!rq->expired_timestamp)
9401 + rq->expired_timestamp = jiffies;
9402 + enqueue_task(p, rq->expired);
9404 + enqueue_task(p, rq->active);
9408 + if (unlikely(time_before_eq(this_rq()->last_jiffy + BUSY_REBALANCE_TICK, jiffies))) {
9409 + load_balance(rq, 0);
9410 + rq->last_jiffy = jiffies;
9413 + spin_unlock(&rq->lock);
9416 +void scheduling_functions_start_here(void) { }
9419 + * 'schedule()' is the main scheduler function.
9421 +asmlinkage void schedule(void)
9423 + task_t *prev, *next;
9425 + prio_array_t *array;
9429 + if (unlikely(in_interrupt()))
9433 - prev->active_mm = NULL;
9439 + release_kernel_lock(prev, smp_processor_id());
9440 + prev->sleep_timestamp = jiffies;
9441 + spin_lock_irq(&rq->lock);
9443 + switch (prev->state) {
9444 + case TASK_INTERRUPTIBLE:
9445 + if (unlikely(signal_pending(prev))) {
9446 + prev->state = TASK_RUNNING;
9450 + deactivate_task(prev, rq);
9451 + case TASK_RUNNING:
9457 + if (unlikely(!rq->nr_running)) {
9459 + load_balance(rq, 2);
9460 + rq->last_jiffy = jiffies;
9461 + if (rq->nr_running)
9462 + goto pick_next_task;
9465 + rq->expired_timestamp = 0;
9466 + goto switch_tasks;
9470 - * This just switches the register state and the
9473 - switch_to(prev, next, prev);
9474 - __schedule_tail(prev);
9475 + array = rq->active;
9476 + if (unlikely(!array->nr_active)) {
9478 + * Switch the active and expired arrays.
9480 + rq->active = rq->expired;
9481 + rq->expired = array;
9482 + array = rq->active;
9483 + rq->expired_timestamp = 0;
9486 + idx = sched_find_first_bit(array->bitmap);
9487 + queue = array->queue + idx;
9488 + next = list_entry(queue->next, task_t, run_list);
9492 + clear_tsk_need_resched(prev);
9494 + if (likely(prev != next)) {
9495 + rq->nr_switches++;
9498 + prepare_arch_switch(rq, next);
9499 + prev = context_switch(prev, next);
9502 + finish_arch_switch(rq, prev);
9504 + spin_unlock_irq(&rq->lock);
9507 reacquire_kernel_lock(current);
9508 - if (current->need_resched)
9509 - goto need_resched_back;
9511 + if (need_resched())
9512 + goto need_resched;
9516 - * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just wake everything
9517 - * up. If it's an exclusive wakeup (nr_exclusive == small +ve number) then we wake all the
9518 - * non-exclusive tasks and one exclusive task.
9519 + * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
9520 + * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
9521 + * number) then we wake all the non-exclusive tasks and one exclusive task.
9523 * There are circumstances in which we can try to wake a task which has already
9524 - * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns zero
9525 - * in this (rare) case, and we handle it by contonuing to scan the queue.
9526 + * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
9527 + * zero in this (rare) case, and we handle it by continuing to scan the queue.
9529 -static inline void __wake_up_common (wait_queue_head_t *q, unsigned int mode,
9530 - int nr_exclusive, const int sync)
9531 +static inline void __wake_up_common(wait_queue_head_t *q, unsigned int mode, int nr_exclusive, int sync)
9533 struct list_head *tmp;
9534 - struct task_struct *p;
9536 - CHECK_MAGIC_WQHEAD(q);
9537 - WQ_CHECK_LIST_HEAD(&q->task_list);
9539 - list_for_each(tmp,&q->task_list) {
9540 - unsigned int state;
9541 - wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
9542 + unsigned int state;
9543 + wait_queue_t *curr;
9546 - CHECK_MAGIC(curr->__magic);
9547 + list_for_each(tmp, &q->task_list) {
9548 + curr = list_entry(tmp, wait_queue_t, task_list);
9551 - if (state & mode) {
9552 - WQ_NOTE_WAKER(curr);
9553 - if (try_to_wake_up(p, sync) && (curr->flags&WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
9554 + if ((state & mode) && try_to_wake_up(p, sync) &&
9555 + ((curr->flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive))
9561 -void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr)
9562 +void __wake_up(wait_queue_head_t *q, unsigned int mode, int nr_exclusive)
9565 - unsigned long flags;
9566 - wq_read_lock_irqsave(&q->lock, flags);
9567 - __wake_up_common(q, mode, nr, 0);
9568 - wq_read_unlock_irqrestore(&q->lock, flags);
9570 + unsigned long flags;
9575 + wq_read_lock_irqsave(&q->lock, flags);
9576 + __wake_up_common(q, mode, nr_exclusive, 0);
9577 + wq_read_unlock_irqrestore(&q->lock, flags);
9580 -void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr)
9583 +void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr_exclusive)
9586 - unsigned long flags;
9587 - wq_read_lock_irqsave(&q->lock, flags);
9588 - __wake_up_common(q, mode, nr, 1);
9589 - wq_read_unlock_irqrestore(&q->lock, flags);
9591 + unsigned long flags;
9596 + wq_read_lock_irqsave(&q->lock, flags);
9597 + if (likely(nr_exclusive))
9598 + __wake_up_common(q, mode, nr_exclusive, 1);
9600 + __wake_up_common(q, mode, nr_exclusive, 0);
9601 + wq_read_unlock_irqrestore(&q->lock, flags);
9606 void complete(struct completion *x)
9608 unsigned long flags;
9610 - spin_lock_irqsave(&x->wait.lock, flags);
9611 + wq_write_lock_irqsave(&x->wait.lock, flags);
9613 __wake_up_common(&x->wait, TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE, 1, 0);
9614 - spin_unlock_irqrestore(&x->wait.lock, flags);
9615 + wq_write_unlock_irqrestore(&x->wait.lock, flags);
9618 void wait_for_completion(struct completion *x)
9620 - spin_lock_irq(&x->wait.lock);
9621 + wq_write_lock_irq(&x->wait.lock);
9623 DECLARE_WAITQUEUE(wait, current);
9625 @@ -775,14 +1060,14 @@
9626 __add_wait_queue_tail(&x->wait, &wait);
9628 __set_current_state(TASK_UNINTERRUPTIBLE);
9629 - spin_unlock_irq(&x->wait.lock);
9630 + wq_write_unlock_irq(&x->wait.lock);
9632 - spin_lock_irq(&x->wait.lock);
9633 + wq_write_lock_irq(&x->wait.lock);
9635 __remove_wait_queue(&x->wait, &wait);
9638 - spin_unlock_irq(&x->wait.lock);
9639 + wq_write_unlock_irq(&x->wait.lock);
9642 #define SLEEP_ON_VAR \
9643 @@ -850,6 +1135,41 @@
9645 void scheduling_functions_end_here(void) { }
9647 +void set_user_nice(task_t *p, long nice)
9649 + unsigned long flags;
9650 + prio_array_t *array;
9653 + if (TASK_NICE(p) == nice || nice < -20 || nice > 19)
9656 + * We have to be careful, if called from sys_setpriority(),
9657 + * the task might be in the middle of scheduling on another CPU.
9659 + rq = task_rq_lock(p, &flags);
9661 + p->static_prio = NICE_TO_PRIO(nice);
9666 + dequeue_task(p, array);
9667 + p->static_prio = NICE_TO_PRIO(nice);
9668 + p->prio = NICE_TO_PRIO(nice);
9670 + enqueue_task(p, array);
9672 + * If the task is running and lowered its priority, or
9673 + * raised its priority, then reschedule its CPU:
9675 + if (p == rq->curr)
9676 + resched_task(rq->curr);
9679 + task_rq_unlock(rq, &flags);
9685 @@ -860,7 +1180,7 @@
9687 asmlinkage long sys_nice(int increment)
9693 * Setpriority might change our priority at the same moment.
9694 @@ -876,32 +1196,46 @@
9698 - newprio = current->nice + increment;
9699 - if (newprio < -20)
9703 - current->nice = newprio;
9704 + nice = PRIO_TO_NICE(current->static_prio) + increment;
9709 + set_user_nice(current, nice);
9715 -static inline struct task_struct *find_process_by_pid(pid_t pid)
9717 + * This is the priority value as seen by users in /proc
9719 + * RT tasks are offset by -200. Normal tasks are centered
9720 + * around 0, value goes from -16 to +15.
9722 +int task_prio(task_t *p)
9724 - struct task_struct *tsk = current;
9725 + return p->prio - MAX_USER_RT_PRIO;
9729 - tsk = find_task_by_pid(pid);
9731 +int task_nice(task_t *p)
9733 + return TASK_NICE(p);
9736 -static int setscheduler(pid_t pid, int policy,
9737 - struct sched_param *param)
9738 +static inline task_t *find_process_by_pid(pid_t pid)
9740 + return pid ? find_task_by_pid(pid) : current;
9743 +static int setscheduler(pid_t pid, int policy, struct sched_param *param)
9745 struct sched_param lp;
9746 - struct task_struct *p;
9747 + prio_array_t *array;
9748 + unsigned long flags;
9754 if (!param || pid < 0)
9755 @@ -915,14 +1249,19 @@
9756 * We play safe to avoid deadlocks.
9758 read_lock_irq(&tasklist_lock);
9759 - spin_lock(&runqueue_lock);
9761 p = find_process_by_pid(pid);
9767 + goto out_unlock_tasklist;
9770 + * To be able to change p->policy safely, the appropriate
9771 + * runqueue lock must be held.
9773 + rq = task_rq_lock(p, &flags);
9778 @@ -931,40 +1270,48 @@
9779 policy != SCHED_OTHER)
9785 - * Valid priorities for SCHED_FIFO and SCHED_RR are 1..99, valid
9786 - * priority for SCHED_OTHER is 0.
9787 + * Valid priorities for SCHED_FIFO and SCHED_RR are
9788 + * 1..MAX_USER_RT_PRIO-1, valid priority for SCHED_OTHER is 0.
9791 - if (lp.sched_priority < 0 || lp.sched_priority > 99)
9792 + if (lp.sched_priority < 0 || lp.sched_priority > MAX_USER_RT_PRIO-1)
9794 if ((policy == SCHED_OTHER) != (lp.sched_priority == 0))
9798 - if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
9799 + if ((policy == SCHED_FIFO || policy == SCHED_RR) &&
9800 !capable(CAP_SYS_NICE))
9802 if ((current->euid != p->euid) && (current->euid != p->uid) &&
9803 !capable(CAP_SYS_NICE))
9808 + deactivate_task(p, task_rq(p));
9811 p->rt_priority = lp.sched_priority;
9813 - current->need_resched = 1;
9814 + if (policy != SCHED_OTHER)
9815 + p->prio = MAX_USER_RT_PRIO-1 - p->rt_priority;
9817 + p->prio = p->static_prio;
9819 + activate_task(p, task_rq(p));
9822 - spin_unlock(&runqueue_lock);
9823 + task_rq_unlock(rq, &flags);
9824 +out_unlock_tasklist:
9825 read_unlock_irq(&tasklist_lock);
9831 -asmlinkage long sys_sched_setscheduler(pid_t pid, int policy,
9832 +asmlinkage long sys_sched_setscheduler(pid_t pid, int policy,
9833 struct sched_param *param)
9835 return setscheduler(pid, policy, param);
9836 @@ -977,7 +1324,7 @@
9838 asmlinkage long sys_sched_getscheduler(pid_t pid)
9840 - struct task_struct *p;
9845 @@ -988,7 +1335,7 @@
9846 read_lock(&tasklist_lock);
9847 p = find_process_by_pid(pid);
9849 - retval = p->policy & ~SCHED_YIELD;
9850 + retval = p->policy;
9851 read_unlock(&tasklist_lock);
9854 @@ -997,7 +1344,7 @@
9856 asmlinkage long sys_sched_getparam(pid_t pid, struct sched_param *param)
9858 - struct task_struct *p;
9860 struct sched_param lp;
9863 @@ -1028,42 +1375,64 @@
9865 asmlinkage long sys_sched_yield(void)
9868 - * Trick. sched_yield() first counts the number of truly
9869 - * 'pending' runnable processes, then returns if it's
9870 - * only the current processes. (This test does not have
9871 - * to be atomic.) In threaded applications this optimization
9872 - * gets triggered quite often.
9874 + runqueue_t *rq = this_rq();
9875 + prio_array_t *array;
9878 - int nr_pending = nr_running;
9879 + spin_lock_irq(&rq->lock);
9881 + if (unlikely(rq->nr_running == 1)) {
9882 + spin_unlock_irq(&rq->lock);
9888 + array = current->array;
9889 + if (unlikely(rt_task(current))) {
9890 + list_del(&current->run_list);
9891 + list_add_tail(&current->run_list, array->queue + current->prio);
9895 - // Subtract non-idle processes running on other CPUs.
9896 - for (i = 0; i < smp_num_cpus; i++) {
9897 - int cpu = cpu_logical_map(i);
9898 - if (aligned_data[cpu].schedule_data.curr != idle_task(cpu))
9900 + if (unlikely(array == rq->expired) && rq->active->nr_active)
9903 + list_del(&current->run_list);
9904 + if (!list_empty(array->queue + current->prio)) {
9905 + list_add(&current->run_list, array->queue[current->prio].next);
9909 - // on UP this process is on the runqueue as well
9914 + __clear_bit(current->prio, array->bitmap);
9915 + if (likely(array == rq->active) && array->nr_active == 1) {
9917 - * This process can only be rescheduled by us,
9918 - * so this is safe without any locking.
9919 + * We're the last task in the active queue so
9920 + * we must move ourselves to the expired array
9921 + * to avoid running again immediately.
9923 - if (current->policy == SCHED_OTHER)
9924 - current->policy |= SCHED_YIELD;
9925 - current->need_resched = 1;
9927 - spin_lock_irq(&runqueue_lock);
9928 - move_last_runqueue(current);
9929 - spin_unlock_irq(&runqueue_lock);
9930 + array->nr_active--;
9931 + array = rq->expired;
9932 + array->nr_active++;
9935 + i = sched_find_first_bit(array->bitmap);
9937 + BUG_ON(i == MAX_PRIO);
9938 + BUG_ON(i == current->prio && array == current->array);
9940 + if (array == current->array && i < current->prio)
9941 + i = current->prio;
9943 + current->array = array;
9944 + current->prio = i;
9946 + list_add(&current->run_list, array->queue[i].next);
9947 + __set_bit(i, array->bitmap);
9950 + spin_unlock_irq(&rq->lock);
9957 @@ -1075,14 +1444,13 @@
9961 - set_current_state(TASK_RUNNING);
9962 + __set_current_state(TASK_RUNNING);
9967 void __cond_resched(void)
9969 - set_current_state(TASK_RUNNING);
9970 + __set_current_state(TASK_RUNNING);
9974 @@ -1093,7 +1461,7 @@
9979 + ret = MAX_USER_RT_PRIO-1;
9983 @@ -1120,7 +1488,7 @@
9984 asmlinkage long sys_sched_rr_get_interval(pid_t pid, struct timespec *interval)
9987 - struct task_struct *p;
9989 int retval = -EINVAL;
9992 @@ -1130,8 +1498,8 @@
9993 read_lock(&tasklist_lock);
9994 p = find_process_by_pid(pid);
9996 - jiffies_to_timespec(p->policy & SCHED_FIFO ? 0 : NICE_TO_TICKS(p->nice),
9998 + jiffies_to_timespec(p->policy & SCHED_FIFO ?
9999 + 0 : TASK_TIMESLICE(p), &t);
10000 read_unlock(&tasklist_lock);
10002 retval = copy_to_user(interval, &t, sizeof(t)) ? -EFAULT : 0;
10003 @@ -1139,14 +1507,14 @@
10007 -static void show_task(struct task_struct * p)
10008 +static void show_task(task_t * p)
10010 unsigned long free = 0;
10012 static const char * stat_nam[] = { "R", "S", "D", "Z", "T", "W" };
10014 printk("%-13.13s ", p->comm);
10015 - state = p->state ? ffz(~p->state) + 1 : 0;
10016 + state = p->state ? __ffs(p->state) + 1 : 0;
10017 if (((unsigned) state) < sizeof(stat_nam)/sizeof(char *))
10018 printk(stat_nam[state]);
10020 @@ -1187,7 +1555,7 @@
10021 printk(" (NOTLB)\n");
10024 - extern void show_trace_task(struct task_struct *tsk);
10025 + extern void show_trace_task(task_t *tsk);
10026 show_trace_task(p);
10029 @@ -1209,7 +1577,7 @@
10031 void show_state(void)
10033 - struct task_struct *p;
10036 #if (BITS_PER_LONG == 32)
10038 @@ -1232,128 +1600,280 @@
10039 read_unlock(&tasklist_lock);
10043 - * reparent_to_init() - Reparent the calling kernel thread to the init task.
10045 - * If a kernel thread is launched as a result of a system call, or if
10046 - * it ever exits, it should generally reparent itself to init so that
10047 - * it is correctly cleaned up on exit.
10049 + * double_rq_lock - safely lock two runqueues
10051 - * The various task state such as scheduling policy and priority may have
10052 - * been inherited fro a user process, so we reset them to sane values here.
10053 + * Note this does not disable interrupts like task_rq_lock;
10054 + * you need to do so manually before calling.
10056 +static inline void double_rq_lock(runqueue_t *rq1, runqueue_t *rq2)
10059 + spin_lock(&rq1->lock);
10062 + spin_lock(&rq1->lock);
10063 + spin_lock(&rq2->lock);
10065 + spin_lock(&rq2->lock);
10066 + spin_lock(&rq1->lock);
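Taking the two locks in ascending &runqueue order means any two CPUs contending for the same pair of runqueues acquire them in the same order, so the classic AB-BA deadlock cannot occur; this is the ordering rule stated in the comment at the top of this file.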
10072 + * double_rq_unlock - safely unlock two runqueues
10074 - * NOTE that reparent_to_init() gives the caller full capabilities.
10075 + * Note this does not restore interrupts like task_rq_unlock;
10076 + * you need to do so manually after calling.
10078 -void reparent_to_init(void)
10079 +static inline void double_rq_unlock(runqueue_t *rq1, runqueue_t *rq2)
10081 - struct task_struct *this_task = current;
10082 + spin_unlock(&rq1->lock);
10084 + spin_unlock(&rq2->lock);
10087 - write_lock_irq(&tasklist_lock);
10088 +void __init init_idle(task_t *idle, int cpu)
10090 + runqueue_t *idle_rq = cpu_rq(cpu), *rq = cpu_rq(idle->cpu);
10091 + unsigned long flags;
10093 - /* Reparent to init */
10094 - REMOVE_LINKS(this_task);
10095 - this_task->p_pptr = child_reaper;
10096 - this_task->p_opptr = child_reaper;
10097 - SET_LINKS(this_task);
10098 + __save_flags(flags);
10100 + double_rq_lock(idle_rq, rq);
10102 + idle_rq->curr = idle_rq->idle = idle;
10103 + deactivate_task(idle, rq);
10104 + idle->array = NULL;
10105 + idle->prio = MAX_PRIO;
10106 + idle->state = TASK_RUNNING;
10108 + double_rq_unlock(idle_rq, rq);
10109 + set_tsk_need_resched(idle);
10110 + __restore_flags(flags);
10113 - /* Set the exit signal to SIGCHLD so we signal init on exit */
10114 - this_task->exit_signal = SIGCHLD;
10115 +extern void init_timervecs(void);
10116 +extern void timer_bh(void);
10117 +extern void tqueue_bh(void);
10118 +extern void immediate_bh(void);
10120 - /* We also take the runqueue_lock while altering task fields
10121 - * which affect scheduling decisions */
10122 - spin_lock(&runqueue_lock);
10123 +void __init sched_init(void)
10128 + for (i = 0; i < NR_CPUS; i++) {
10129 + prio_array_t *array;
10131 - this_task->ptrace = 0;
10132 - this_task->nice = DEF_NICE;
10133 - this_task->policy = SCHED_OTHER;
10134 - /* cpus_allowed? */
10135 - /* rt_priority? */
10137 - this_task->cap_effective = CAP_INIT_EFF_SET;
10138 - this_task->cap_inheritable = CAP_INIT_INH_SET;
10139 - this_task->cap_permitted = CAP_FULL_SET;
10140 - this_task->keep_capabilities = 0;
10141 - memcpy(this_task->rlim, init_task.rlim, sizeof(*(this_task->rlim)));
10142 - this_task->user = INIT_USER;
10144 + rq->active = rq->arrays;
10145 + rq->expired = rq->arrays + 1;
10146 + spin_lock_init(&rq->lock);
10148 + INIT_LIST_HEAD(&rq->migration_queue);
10151 - spin_unlock(&runqueue_lock);
10152 - write_unlock_irq(&tasklist_lock);
10153 + for (j = 0; j < 2; j++) {
10154 + array = rq->arrays + j;
10155 + for (k = 0; k < MAX_PRIO; k++) {
10156 + INIT_LIST_HEAD(array->queue + k);
10157 + __clear_bit(k, array->bitmap);
10159 + // delimiter for bitsearch
10160 + __set_bit(MAX_PRIO, array->bitmap);
10164 + * We have to do a little magic to get the first
10165 + * process right in SMP mode.
10168 + rq->curr = current;
10169 + rq->idle = current;
10170 + current->cpu = smp_processor_id();
10171 + wake_up_process(current);
10173 + init_timervecs();
10174 + init_bh(TIMER_BH, timer_bh);
10175 + init_bh(TQUEUE_BH, tqueue_bh);
10176 + init_bh(IMMEDIATE_BH, immediate_bh);
10179 + * The boot idle thread does lazy MMU switching as well:
10181 + atomic_inc(&init_mm.mm_count);
10182 + enter_lazy_tlb(&init_mm, current, smp_processor_id());
10188 - * Put all the gunge required to become a kernel thread without
10189 - * attached user resources in one place where it belongs.
10190 + * This is how migration works:
10192 + * 1) we queue a migration_req_t structure in the source CPU's
10193 + * runqueue and wake up that CPU's migration thread.
10194 + * 2) we wait_for_completion() => thread blocks.
10195 + * 3) migration thread wakes up (implicitly it forces the migrated
10196 + * thread off the CPU)
10197 + * 4) it gets the migration request and checks whether the migrated
10198 + * task is still in the wrong runqueue.
10199 + * 5) if it's in the wrong runqueue then the migration thread removes
10200 + * it and puts it into the right queue.
10201 + * 6) migration thread calls complete() on the request.
10202 + * 7) we wake up and the migration is done.
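A minimal caller-side sketch of this machinery (illustrative only), using the set_cpus_allowed() interface defined just below:

	/* Illustrative: pin the calling kernel thread to CPU 0.
	 * set_cpus_allowed() can block waiting for the migration
	 * completion, so no spinlocks may be held here. */
	set_cpus_allowed(current, 1UL << 0);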
10205 -void daemonize(void)
10209 + struct completion done;
10210 +} migration_req_t;
10213 + * Change a given task's CPU affinity. Migrate the process to a
10214 + * proper CPU and schedule it away if the CPU it's executing on
10215 + * is removed from the allowed bitmask.
10217 + * NOTE: the caller must have a valid reference to the task, the
10218 + * task must not exit() & deallocate itself prematurely. The
10219 + * call is not atomic; no spinlocks may be held.
10221 +void set_cpus_allowed(task_t *p, unsigned long new_mask)
10223 - struct fs_struct *fs;
10224 + unsigned long flags;
10225 + migration_req_t req;
10228 + new_mask &= cpu_online_map;
10232 + rq = task_rq_lock(p, &flags);
10233 + p->cpus_allowed = new_mask;
10235 - * If we were started as result of loading a module, close all of the
10236 - * user space pages. We don't need them, and if we didn't close them
10237 - * they would be locked into memory.
10238 + * Can the task run on the task's current CPU? If not then
10239 + * migrate the process off to a proper CPU.
10241 - exit_mm(current);
10242 + if (new_mask & (1UL << p->cpu)) {
10243 + task_rq_unlock(rq, &flags);
10247 - current->session = 1;
10248 - current->pgrp = 1;
10249 - current->tty = NULL;
10251 + * If the task is not on a runqueue, then it is safe to
10252 + * simply update the task's cpu field.
10254 + if (!p->array && (p != rq->curr)) {
10255 + p->cpu = __ffs(p->cpus_allowed);
10256 + task_rq_unlock(rq, &flags);
10260 - /* Become as one with the init task */
10261 + init_completion(&req.done);
10263 + list_add(&req.list, &rq->migration_queue);
10264 + task_rq_unlock(rq, &flags);
10265 + wake_up_process(rq->migration_thread);
10267 - exit_fs(current); /* current->fs->count--; */
10268 - fs = init_task.fs;
10269 - current->fs = fs;
10270 - atomic_inc(&fs->count);
10271 - exit_files(current);
10272 - current->files = init_task.files;
10273 + atomic_inc(&current->files->count);
10274 + wait_for_completion(&req.done);
10277 -extern unsigned long wait_init_idle;
10278 +static __initdata int master_migration_thread;
10280 -void __init init_idle(void)
10281 +static int migration_thread(void * bind_cpu)
10283 - struct schedule_data * sched_data;
10284 - sched_data = &aligned_data[smp_processor_id()].schedule_data;
10285 + int cpu = cpu_logical_map((int) (long) bind_cpu);
10286 + struct sched_param param = { sched_priority: MAX_RT_PRIO-1 };
10290 - if (current != &init_task && task_on_runqueue(current)) {
10291 - printk("UGH! (%d:%d) was on the runqueue, removing.\n",
10292 - smp_processor_id(), current->pid);
10293 - del_from_runqueue(current);
10295 + sigfillset(&current->blocked);
10296 + set_fs(KERNEL_DS);
10298 + * The first migration thread is started on the boot CPU, it
10299 + * migrates the other migration threads to their destination CPUs.
10301 + if (cpu != master_migration_thread) {
10302 + while (!cpu_rq(master_migration_thread)->migration_thread)
10304 + set_cpus_allowed(current, 1UL << cpu);
10306 - sched_data->curr = current;
10307 - sched_data->last_schedule = get_cycles();
10308 - clear_bit(current->processor, &wait_init_idle);
10310 + printk("migration_task %d on cpu=%d\n", cpu, smp_processor_id());
10311 + ret = setscheduler(0, SCHED_FIFO, ¶m);
10313 -extern void init_timervecs (void);
10315 + rq->migration_thread = current;
10317 -void __init sched_init(void)
10320 - * We have to do a little magic to get the first
10321 - * process right in SMP mode.
10323 - int cpu = smp_processor_id();
10325 + sprintf(current->comm, "migration_CPU%d", smp_processor_id());
10327 - init_task.processor = cpu;
10329 + runqueue_t *rq_src, *rq_dest;
10330 + struct list_head *head;
10331 + int cpu_src, cpu_dest;
10332 + migration_req_t *req;
10333 + unsigned long flags;
10336 - for(nr = 0; nr < PIDHASH_SZ; nr++)
10337 - pidhash[nr] = NULL;
10338 + spin_lock_irqsave(&rq->lock, flags);
10339 + head = &rq->migration_queue;
10340 + current->state = TASK_INTERRUPTIBLE;
10341 + if (list_empty(head)) {
10342 + spin_unlock_irqrestore(&rq->lock, flags);
10346 + req = list_entry(head->next, migration_req_t, list);
10347 + list_del_init(head->next);
10348 + spin_unlock_irqrestore(&rq->lock, flags);
10351 + cpu_dest = __ffs(p->cpus_allowed);
10352 + rq_dest = cpu_rq(cpu_dest);
10354 + cpu_src = p->cpu;
10355 + rq_src = cpu_rq(cpu_src);
10357 + local_irq_save(flags);
10358 + double_rq_lock(rq_src, rq_dest);
10359 + if (p->cpu != cpu_src) {
10360 + double_rq_unlock(rq_src, rq_dest);
10361 + local_irq_restore(flags);
10364 + if (rq_src == rq) {
10365 + p->cpu = cpu_dest;
10367 + deactivate_task(p, rq_src);
10368 + activate_task(p, rq_dest);
10371 + double_rq_unlock(rq_src, rq_dest);
10372 + local_irq_restore(flags);
10374 - init_timervecs();
10375 + complete(&req->done);
10379 - init_bh(TIMER_BH, timer_bh);
10380 - init_bh(TQUEUE_BH, tqueue_bh);
10381 - init_bh(IMMEDIATE_BH, immediate_bh);
10382 +void __init migration_init(void)
10387 - * The boot idle thread does lazy MMU switching as well:
10389 - atomic_inc(&init_mm.mm_count);
10390 - enter_lazy_tlb(&init_mm, current, cpu);
10391 + master_migration_thread = smp_processor_id();
10392 + current->cpus_allowed = 1UL << master_migration_thread;
10394 + for (cpu = 0; cpu < smp_num_cpus; cpu++) {
10395 + if (kernel_thread(migration_thread, (void *) (long) cpu,
10396 + CLONE_FS | CLONE_FILES | CLONE_SIGNAL) < 0)
10399 + current->cpus_allowed = -1L;
10401 + for (cpu = 0; cpu < smp_num_cpus; cpu++)
10402 + while (!cpu_rq(cpu_logical_map(cpu))->migration_thread)
10403 + schedule_timeout(2);
10406 +#endif /* CONFIG_SMP */
10407 diff -urN linux-2.4.20/kernel/signal.c linux-2.4.20-o1/kernel/signal.c
10408 --- linux-2.4.20/kernel/signal.c Fri Nov 29 00:53:15 2002
10409 +++ linux-2.4.20-o1/kernel/signal.c Wed Mar 12 00:41:43 2003
10410 @@ -490,12 +490,9 @@
10411 * process of changing - but no harm is done by that
10412 * other than doing an extra (lightweight) IPI interrupt.
10414 - spin_lock(&runqueue_lock);
10415 - if (task_has_cpu(t) && t->processor != smp_processor_id())
10416 - smp_send_reschedule(t->processor);
10417 - spin_unlock(&runqueue_lock);
10418 -#endif /* CONFIG_SMP */
10420 + if ((t->state == TASK_RUNNING) && (t->cpu != cpu()))
10421 + kick_if_running(t);
10423 if (t->state & TASK_INTERRUPTIBLE) {
10424 wake_up_process(t);
10426 diff -urN linux-2.4.20/kernel/softirq.c linux-2.4.20-o1/kernel/softirq.c
10427 --- linux-2.4.20/kernel/softirq.c Fri Nov 29 00:53:15 2002
10428 +++ linux-2.4.20-o1/kernel/softirq.c Wed Mar 12 00:41:43 2003
10429 @@ -364,13 +364,13 @@
10430 int cpu = cpu_logical_map(bind_cpu);
10433 - current->nice = 19;
10434 + set_user_nice(current, 19);
10435 sigfillset(&current->blocked);
10437 /* Migrate to the right CPU */
10438 - current->cpus_allowed = 1UL << cpu;
10439 - while (smp_processor_id() != cpu)
10441 + set_cpus_allowed(current, 1UL << cpu);
10442 + if (cpu() != cpu)
10445 sprintf(current->comm, "ksoftirqd_CPU%d", bind_cpu);
10447 @@ -395,7 +395,7 @@
10451 -static __init int spawn_ksoftirqd(void)
10452 +__init int spawn_ksoftirqd(void)
10456 diff -urN linux-2.4.20/kernel/sys.c linux-2.4.20-o1/kernel/sys.c
10457 --- linux-2.4.20/kernel/sys.c Sat Aug 3 02:39:46 2002
10458 +++ linux-2.4.20-o1/kernel/sys.c Wed Mar 12 00:41:43 2003
10459 @@ -220,10 +220,10 @@
10461 if (error == -ESRCH)
10463 - if (niceval < p->nice && !capable(CAP_SYS_NICE))
10464 + if (niceval < task_nice(p) && !capable(CAP_SYS_NICE))
10467 - p->nice = niceval;
10468 + set_user_nice(p, niceval);
10470 read_unlock(&tasklist_lock);
10472 @@ -249,7 +249,7 @@
10474 if (!proc_sel(p, which, who))
10476 - niceval = 20 - p->nice;
10477 + niceval = 20 - task_nice(p);
10478 if (niceval > retval)
10481 diff -urN linux-2.4.20/kernel/timer.c linux-2.4.20-o1/kernel/timer.c
10482 --- linux-2.4.20/kernel/timer.c Fri Nov 29 00:53:15 2002
10483 +++ linux-2.4.20-o1/kernel/timer.c Wed Mar 12 00:41:43 2003
10486 #include <asm/uaccess.h>
10488 +struct kernel_stat kstat;
10491 * Timekeeping variables
@@ -598,25 +600,7 @@
 	int cpu = smp_processor_id(), system = user_tick ^ 1;
 
 	update_one_process(p, user_tick, system, cpu);
-	if (p->pid) {
-		if (--p->counter <= 0) {
-			p->counter = 0;
-			/*
-			 * SCHED_FIFO is priority preemption, so this is
-			 * not the place to decide whether to reschedule a
-			 * SCHED_FIFO task or not - Bhavesh Davda
-			 */
-			if (p->policy != SCHED_FIFO) {
-				p->need_resched = 1;
-			}
-		}
-		if (p->nice > 0)
-			kstat.per_cpu_nice[cpu] += user_tick;
-		else
-			kstat.per_cpu_user[cpu] += user_tick;
-		kstat.per_cpu_system[cpu] += system;
-	} else if (local_bh_count(cpu) || local_irq_count(cpu) > 1)
-		kstat.per_cpu_system[cpu] += system;
+	scheduler_tick(user_tick, system);
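All of the removed logic - the timeslice countdown, the need_resched marking,
and the per-CPU kstat user/nice/system accounting - is folded into
scheduler_tick() in kernel/sched.c, next to the data it manipulates; that is
presumably also why the kstat definition now lives in timer.c. The timeslice
half of it, in rough outline (a sketch of the idea, not the patch's exact
code; the refill is derived from the task's priority, bounded by
MIN_TIMESLICE and MAX_TIMESLICE):

	/* Inside scheduler_tick(), roughly: */
	if (!--p->time_slice) {
		p->need_resched = 1;
		p->time_slice = task_timeslice(p);	/* refill from priority */
		/* the expired task is then moved to the runqueue's
		 * expired array unless it is classed as interactive */
	}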
@@ -624,17 +608,7 @@
 static unsigned long count_active_tasks(void)
 {
-	struct task_struct *p;
-	unsigned long nr = 0;
-
-	read_lock(&tasklist_lock);
-	for_each_task(p) {
-		if ((p->state == TASK_RUNNING ||
-		     (p->state & TASK_UNINTERRUPTIBLE)))
-			nr += FIXED_1;
-	}
-	read_unlock(&tasklist_lock);
-	return nr;
+	return (nr_running() + nr_uninterruptible()) * FIXED_1;
 }
@@ -827,6 +801,89 @@
 
+static void process_timeout(unsigned long __data)
+{
+	wake_up_process((task_t *)__data);
+}
+
+/**
+ * schedule_timeout - sleep until timeout
+ * @timeout: timeout value in jiffies
+ *
+ * Make the current task sleep until @timeout jiffies have
+ * elapsed. The routine will return immediately unless
+ * the current task state has been set (see set_current_state()).
+ *
+ * You can set the task state as follows -
+ *
+ * %TASK_UNINTERRUPTIBLE - at least @timeout jiffies are guaranteed to
+ * pass before the routine returns. The routine will return 0.
+ *
+ * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
+ * delivered to the current task. In this case the remaining time
+ * in jiffies will be returned, or 0 if the timer expired in time.
+ *
+ * The current task state is guaranteed to be TASK_RUNNING when this
+ * routine returns.
+ *
+ * Specifying a @timeout value of %MAX_SCHEDULE_TIMEOUT will schedule
+ * the CPU away without a bound on the timeout. In this case the return
+ * value will be %MAX_SCHEDULE_TIMEOUT.
+ *
+ * In all cases the return value is guaranteed to be non-negative.
+ */
+signed long schedule_timeout(signed long timeout)
+{
+	struct timer_list timer;
+	unsigned long expire;
+
+	switch (timeout)
+	{
+	case MAX_SCHEDULE_TIMEOUT:
+		/*
+		 * These two special cases are useful to be comfortable
+		 * in the caller. Nothing more. We could take
+		 * MAX_SCHEDULE_TIMEOUT from one of the negative values
+		 * but I'd like to return a valid offset (>=0) to allow
+		 * the caller to do everything it wants with the retval.
+		 */
+		schedule();
+		goto out;
+	default:
+		/*
+		 * Another bit of PARANOID. Note that the retval will be
+		 * 0 since no piece of kernel is supposed to do a check
+		 * for a negative retval of schedule_timeout() (since it
+		 * should never happen anyway). You just have the printk()
+		 * that will tell you if something has gone wrong and where.
+		 */
+		if (timeout < 0)
+		{
+			printk(KERN_ERR "schedule_timeout: wrong timeout "
+				"value %lx from %p\n", timeout,
+				__builtin_return_address(0));
+			current->state = TASK_RUNNING;
+			goto out;
+		}
+	}
+
+	expire = timeout + jiffies;
+
+	init_timer(&timer);
+	timer.expires = expire;
+	timer.data = (unsigned long) current;
+	timer.function = process_timeout;
+
+	add_timer(&timer);
+	schedule();
+	del_timer_sync(&timer);
+
+	timeout = expire - jiffies;
+
+ out:
+	return timeout < 0 ? 0 : timeout;
+}
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
@@ -873,4 +930,3 @@
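As the kerneldoc above stresses, schedule_timeout() only actually sleeps if
the caller sets the task state first; called in TASK_RUNNING state it returns
immediately. The canonical calling sequence is:

	signed long remaining;

	set_current_state(TASK_INTERRUPTIBLE);	/* or TASK_UNINTERRUPTIBLE */
	remaining = schedule_timeout(HZ);	/* sleep for about one second */

	/* remaining is 0 on timeout; nonzero means a signal woke us early
	 * (only possible in the TASK_INTERRUPTIBLE case). */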
diff -urN linux-2.4.20/mm/oom_kill.c linux-2.4.20-o1/mm/oom_kill.c
--- linux-2.4.20/mm/oom_kill.c	Fri Nov 29 00:53:15 2002
+++ linux-2.4.20-o1/mm/oom_kill.c	Wed Mar 12 00:41:43 2003
 	 * Niced processes are most likely less important, so double
 	 * their badness points.
 	 */
-	if (p->nice > 0)
+	if (task_nice(p) > 0)
 		points *= 2;
 
@@ -146,7 +146,7 @@
 	 * all the memory it needs. That way it should be able to
 	 * exit() and clear out its resources quickly...
 	 */
-	p->counter = 5 * HZ;
+	p->time_slice = HZ;
 	p->flags |= PF_MEMALLOC | PF_MEMDIE;
 
 	/* This process has hardware access, be more careful. */
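The second hunk is the behavioural one: the old per-task tick counter
(p->counter) no longer exists, so the OOM killer's trick of granting the
chosen victim extra CPU so it can exit quickly now tops up p->time_slice
instead. The grant shrinks from five seconds' worth of ticks to one, which is
still generous given that ordinary O(1) timeslices are far shorter:

	/* Old (stock 2.4): */
	p->counter = 5 * HZ;

	/* New (O(1) scheduler): */
	p->time_slice = HZ;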
diff -urN linux-2.4.20/net/bluetooth/bnep/core.c linux-2.4.20-o1/net/bluetooth/bnep/core.c
--- linux-2.4.20/net/bluetooth/bnep/core.c	Fri Nov 29 00:53:15 2002
+++ linux-2.4.20-o1/net/bluetooth/bnep/core.c	Wed Mar 12 00:41:43 2003
@@ -458,7 +458,7 @@
 	sigfillset(&current->blocked);
 	flush_signals(current);
 
-	current->nice = -15;
+	set_user_nice(current, -15);