[packages/crossavr-libc.git] / 506-avr-libc-optimize_dox.patch

diff -Naurp doc/api/optimize.dox doc/api/optimize.dox
--- doc/api/optimize.dox	1970-01-01 05:30:00.000000000 +0530
+++ doc/api/optimize.dox	2012-07-25 14:29:02.000000000 +0530
@@ -0,0 +1,137 @@
+/* Copyright (c) 2010 Jan Waclawek
+   Copyright (c) 2010 Joerg Wunsch
+   All rights reserved.
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are met:
+
+   * Redistributions of source code must retain the above copyright
+     notice, this list of conditions and the following disclaimer.
+   * Redistributions in binary form must reproduce the above copyright
+     notice, this list of conditions and the following disclaimer in
+     the documentation and/or other materials provided with the
+     distribution.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+  AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+  ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+  CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+  CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+  POSSIBILITY OF SUCH DAMAGE. */
+
+/* $Id$ */
+
+/** \page optimization Compiler optimization
+
+\section optim_code_reorder Problems with reordering code
+\author Jan Waclawek
+
+Programs contain sequences of statements, and a naive compiler would
+emit code that executes them exactly in the order in which they are
+written. But an optimizing compiler is free to \e reorder the
+statements - or even parts of them - if the resulting "net effect" is
+the same. The "measure" of the "net effect" is what the standard calls
+"side effects", and it is accomplished exclusively through accesses
+(reads and writes) to variables qualified as \c volatile. So, as long
+as all volatile reads and writes are to the same addresses and in the
+same order (and writes write the same values), the program is correct,
+regardless of other operations in it. (One important point to note
+here is that the time elapsed between consecutive volatile accesses is
+not considered at all.)
+
+Unfortunately, there are also operations which are not covered by
+volatile accesses. Examples of this in avr-gcc/avr-libc are the
+cli() and sei() macros defined in <avr/interrupt.h>, which convert
+directly to the respective assembler mnemonics through the __asm__()
+statement. These don't constitute a variable access at all, not even a
+volatile one, so the compiler is free to move them around. Although
+there is a "volatile" qualifier which can be attached to the __asm__()
+statement, its effect on (re)ordering is not clear from the
+documentation (it more likely only prevents complete removal by the
+optimiser), as the documentation states (among other things):
+
+<em>Note that even a volatile asm instruction can be moved
+relative to other code, including across jump instructions. [...]
+Similarly, you can't expect a sequence of volatile asm instructions to
+remain perfectly consecutive.</em>
+
+\sa http://gcc.gnu.org/onlinedocs/gcc-4.3.4/gcc/Extended-Asm.html
+
+There is another mechanism which can be used to achieve something
+similar: <em>memory barriers</em>. A memory barrier is created by adding a
+special "memory" clobber to the inline \c asm statement, and ensures that
+all variables are flushed from registers to memory before the
+statement, and then re-read after the statement. The purpose of memory
+barriers is slightly different from enforcing code ordering: they are
+supposed to ensure that there are no variables "cached" in registers,
+so that it is safe to change the content of registers, e.g. when
+switching context in a multitasking OS. (On "big" processors with
+out-of-order execution they also imply the use of special instructions
+which force the processor into an "in-order" state; this is not the
+case on AVRs.)
+
+A memory barrier works well in ensuring that all volatile
+accesses before and after the barrier occur in the given order with
+respect to the barrier. However, it does not prevent the compiler from
+moving non-volatile-related statements across the barrier. Peter
+Dannegger provided a nice example of this effect:
+
+\code
+#define cli() __asm volatile( "cli" ::: "memory" )
+#define sei() __asm volatile( "sei" ::: "memory" )
+
+unsigned int ivar;
+
+void test2( unsigned int val )
+{
+  val = 65535U / val;
+
+  cli();
+
+  ivar = val;
+
+  sei();
+}
+\endcode
+
+compiles with optimisations switched on (-Os) to
+
+\verbatim
+00000112 <test2>:
+ 112:	bc 01       	movw	r22, r24
+ 114:	f8 94       	cli
+ 116:	8f ef       	ldi	r24, 0xFF	; 255
+ 118:	9f ef       	ldi	r25, 0xFF	; 255
+ 11a:	0e 94 96 00 	call	0x12c	; 0x12c <__udivmodhi4>
+ 11e:	70 93 01 02 	sts	0x0201, r23
+ 122:	60 93 00 02 	sts	0x0200, r22
+ 126:	78 94       	sei
+ 128:	08 95       	ret
+\endverbatim
+
+where the potentially slow division is moved across cli(),
+resulting in interrupts being disabled longer than intended. Note
+that the write to \c ivar still occurs in order with respect to cli()
+and sei(), so the "net effect" required by the standard is achieved as
+intended; it is "only" the timing which is off. However, for most
+embedded applications, timing is an important, sometimes critical
+factor.
+
+\sa https://www.mikrocontroller.net/topic/65923
+
+Unfortunately, at the moment, neither avr-gcc nor the C standard
+provides a mechanism to enforce a complete match between written and
+executed code ordering - except maybe switching the optimization
+completely off (-O0), or writing all the critical code in assembly.
+
+To sum it up:
+
+\li memory barriers ensure proper ordering of volatile accesses
+\li memory barriers don't prevent statements with no volatile accesses from being reordered across the barrier
+
+*/