|
@@ -0,0 +1,13008 @@
|
|
|
+\input texinfo
|
|
|
+
|
|
|
+@c Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
|
|
|
+
|
|
|
+@c (The work of Trevis Rothwell and Nelson Beebe has been assigned or
|
|
|
+@c licensed to the FSF.)
|
|
|
+
|
|
|
+@c move alignment later?
|
|
|
+
|
|
|
+@setfilename ./c
|
|
|
+@settitle GNU C Language Manual
|
|
|
+@documentencoding UTF-8
|
|
|
+
|
|
|
+@smallbook
|
|
|
+@synindex vr fn
|
|
|
+
|
|
|
+@copying
|
|
|
+Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
|
|
|
+
|
|
|
+(The work of Trevis Rothwell and Nelson Beebe has been assigned or
|
|
|
+licensed to the FSF.)
|
|
|
+
|
|
|
+@quotation
|
|
|
+Permission is granted to copy, distribute and/or modify this document
|
|
|
+under the terms of the GNU Free Documentation License, Version 1.3 or
|
|
|
+any later version published by the Free Software Foundation; with the
|
|
|
+Invariant Sections being ``GNU General Public License,'' with the
|
|
|
+Front-Cover Texts being ``A GNU Manual,'' and with the Back-Cover
|
|
|
+Texts as in (a) below. A copy of the license is included in the
|
|
|
+section entitled ``GNU Free Documentation License.''
|
|
|
+
|
|
|
+(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
|
|
|
+modify this GNU manual. Buying copies from the FSF supports it in
|
|
|
+developing GNU and promoting software freedom.''
|
|
|
+@end quotation
|
|
|
+@end copying
|
|
|
+
|
|
|
+@dircategory Programming
|
|
|
+@direntry
|
|
|
+* C: (c). GNU C Language Intro and Reference Manual
|
|
|
+@end direntry
|
|
|
+
|
|
|
+@titlepage
|
|
|
+@sp 6
|
|
|
+@center @titlefont{GNU C Language Intro and Reference Manual}
|
|
|
+@sp 4
|
|
|
+@c @center @value{EDITION} Edition
|
|
|
+@sp 5
|
|
|
+@center Richard Stallman
|
|
|
+@center and
|
|
|
+@center Trevis Rothwell
|
|
|
+@center plus Nelson Beebe
|
|
|
+@center on floating point
|
|
|
+@page
|
|
|
+@vskip 0pt plus 1filll
|
|
|
+
|
|
|
+@insertcopying
|
|
|
+
|
|
|
+@sp 2
|
|
|
+WILL BE Published by the Free Software Foundation @*
|
|
|
+51 Franklin Street, Fifth Floor @*
|
|
|
+Boston, MA 02110-1301 USA @*
|
|
|
+ISBN ?-??????-??-?
|
|
|
+
|
|
|
+@ignore
|
|
|
+@sp 1
|
|
|
+Cover art by J. Random Artist
|
|
|
+@end ignore
|
|
|
+
|
|
|
+@end titlepage
|
|
|
+
|
|
|
+@summarycontents
|
|
|
+@contents
|
|
|
+
|
|
|
+
|
|
|
+@node Top
|
|
|
+@ifnottex
|
|
|
+@top GNU C Manual
|
|
|
+@end ifnottex
|
|
|
+@iftex
|
|
|
+@top Preface
|
|
|
+@end iftex
|
|
|
+
|
|
|
+This manual explains the C language for use with the GNU Compiler
|
|
|
+Collection (GCC) on the GNU/Linux system and other systems. We refer
|
|
|
+to this dialect as GNU C. If you already know C, you can use this as
|
|
|
+a reference manual.
|
|
|
+
|
|
|
+If you understand basic concepts of programming but know nothing about
|
|
|
+C, you can read this manual sequentially from the beginning to learn
|
|
|
+the C language.
|
|
|
+
|
|
|
+If you are a beginner to programming, we recommend you first learn a
|
|
|
+language with automatic garbage collection and no explicit pointers,
|
|
|
+rather than starting with C@. Good choices include Lisp, Scheme,
|
|
|
+Python and Java. C's explicit pointers mean that programmers must be
|
|
|
+careful to avoid certain kinds of errors.
|
|
|
+
|
|
|
+C is a venerable language; it was first used in 1973. The GNU C
|
|
|
+Compiler, which was subsequently extended into the GNU Compiler
|
|
|
+Collection, was first released in 1987. Other important languages
|
|
|
+were designed based on C: once you know C, it gives you a useful base
|
|
|
+for learning C@t{++}, C#, Java, Scala, D, Go, and more.
|
|
|
+
|
|
|
+The special advantage of C is that it is fairly simple while allowing
|
|
|
+close access to the computer's hardware, which previously required
|
|
|
+writing in assembler language to describe the individual machine
|
|
|
+instructions. Some have called C a ``high-level assembler language''
|
|
|
+because of its explicit pointers and lack of automatic management of
|
|
|
+storage. As one wag put it, ``C combines the power of assembler
|
|
|
+language with the convenience of assembler language.'' However, C is
|
|
|
+far more portable, and much easier to read and write, than assembler
|
|
|
+language.
|
|
|
+
|
|
|
+This manual focuses on the GNU C language supported by the GNU
|
|
|
+Compiler Collection, version ???. When a construct may be absent or
|
|
|
+work differently in other C compilers, we say so. When it is not part
|
|
|
+of ISO standard C, we say it is a ``GNU C extension,'' because it is
|
|
|
+useful to know that; however, other dialects and standards are not the
|
|
|
+focus of this manual. We keep those notes short, unless it is vital
|
|
|
+to say more. For the same reason, we hardly mention C@t{++} or other
|
|
|
+languages that the GNU Compiler Collection supports.
|
|
|
+
|
|
|
+Some aspects of the meaning of C programs depend on the target
|
|
|
+platform: which computer, and which operating system, the compiled
|
|
|
+code will run on. Where this is the case, we say so.
|
|
|
+
|
|
|
+The C language provides no built-in facilities for performing such
|
|
|
+common operations as input/output, memory management, string
|
|
|
+manipulation, and the like. Instead, these facilities are defined in
|
|
|
+a standard library, which is automatically available in every C
|
|
|
+program. @xref{Top, The GNU C Library, , libc, The GNU C Library
|
|
|
+Reference Manual}.
|
|
|
+
|
|
|
+This manual incorporates the former GNU C Preprocessor Manual, which
|
|
|
+was among the earliest GNU Manuals. It also uses some text from the
|
|
|
+earlier GNU C Manual that was written by Trevis Rothwell and James
|
|
|
+Youngman.
|
|
|
+
|
|
|
+GNU C has many obscure features, each one either for historical
|
|
|
+compatibility or meant for very special situations. We have left them
|
|
|
+to a companion manual, the GNU C Obscurities Manual, which will be
|
|
|
+published digitally later.
|
|
|
+
|
|
|
+@menu
|
|
|
+* The First Example:: Getting started with basic C code.
|
|
|
+* Complete Program:: A whole example program
|
|
|
+ that can be compiled and run.
|
|
|
+* Storage:: Basic layout of storage; bytes.
|
|
|
+* Beyond Integers:: Exploring different numeric types.
|
|
|
+* Lexical Syntax:: The various lexical components of C programs.
|
|
|
+* Arithmetic:: Numeric computations.
|
|
|
+* Assignment Expressions:: Storing values in variables.
|
|
|
+* Execution Control Expressions:: Expressions combining values in various ways.
|
|
|
+* Binary Operator Grammar:: An overview of operator precedence.
|
|
|
+* Order of Execution:: The order of program execution.
|
|
|
+* Primitive Types:: More details about primitive data types.
|
|
|
+* Constants:: Explicit constant values:
|
|
|
+ details and examples.
|
|
|
+* Type Size:: The memory space occupied by a type.
|
|
|
+* Pointers:: Creating and manipulating memory pointers.
|
|
|
+* Structures:: Compound data types built
|
|
|
+ by grouping other types.
|
|
|
+* Arrays:: Creating and manipulating arrays.
|
|
|
+* Enumeration Types:: Sets of integers with named values.
|
|
|
+* Defining Typedef Names:: Using @code{typedef} to define type names.
|
|
|
+* Statements:: Controlling program flow.
|
|
|
+* Variables:: Details about declaring, initializing,
|
|
|
+ and using variables.
|
|
|
+* Type Qualifiers:: Mark variables for certain intended uses.
|
|
|
+* Functions:: Declaring, defining, and calling functions.
|
|
|
+* Compatible Types:: How to tell if two types are compatible
|
|
|
+ with each other.
|
|
|
+* Type Conversions:: Converting between types.
|
|
|
+* Scope:: Different categories of identifier scope.
|
|
|
+* Preprocessing:: Using the GNU C preprocessor.
|
|
|
+* Integers in Depth:: How integer numbers are represented.
|
|
|
+* Floating Point in Depth:: How floating-point numbers are represented.
|
|
|
+* Compilation:: How to compile multi-file programs.
|
|
|
+* Directing Compilation:: Operations that affect compilation
|
|
|
+ but don't change the program.
|
|
|
+
|
|
|
+Appendices
|
|
|
+
|
|
|
+* Type Alignment:: Where in memory a type can validly start.
|
|
|
+* Aliasing:: Accessing the same data in two types.
|
|
|
+* Digraphs:: Two-character aliases for some characters.
|
|
|
+* Attributes:: Specifying additional information
|
|
|
+ in a declaration.
|
|
|
+* Signals:: Fatal errors triggered in various scenarios.
|
|
|
+* GNU Free Documentation License:: The license for this manual.
|
|
|
+* Symbol Index:: Keyword and symbol index.
|
|
|
+* Concept Index:: Detailed topical index.
|
|
|
+
|
|
|
+@detailmenu
|
|
|
+--- The Detailed Node Listing ---
|
|
|
+
|
|
|
+* Recursive Fibonacci:: Writing a simple function recursively.
|
|
|
+* Stack:: Each function call uses space in the stack.
|
|
|
+* Iterative Fibonacci:: Writing the same function iteratively.
|
|
|
+* Complete Example:: Turn the simple function into a full program.
|
|
|
+* Complete Explanation:: Explanation of each part of the example.
|
|
|
+* Complete Line-by-Line:: Explaining each line of the example.
|
|
|
+* Compile Example:: Using GCC to compile the example.
|
|
|
+* Float Example:: A function that uses floating-point numbers.
|
|
|
+* Array Example:: A function that works with arrays.
|
|
|
+* Array Example Call:: How to call that function.
|
|
|
+* Array Example Variations:: Different ways to write the call example.
|
|
|
+
|
|
|
+Lexical Syntax
|
|
|
+
|
|
|
+* English:: Write programs in English!
|
|
|
+* Characters:: The characters allowed in C programs.
|
|
|
+* Whitespace:: The particulars of whitespace characters.
|
|
|
+* Comments:: How to include comments in C code.
|
|
|
+* Identifiers:: How to form identifiers (names).
|
|
|
+* Operators/Punctuation:: Characters used as operators or punctuation.
|
|
|
+* Line Continuation:: Splitting one line into multiple lines.
|
|
|
+* Digraphs:: Two-character substitutes for some characters.
|
|
|
+
|
|
|
+Arithmetic
|
|
|
+
|
|
|
+* Basic Arithmetic:: Addition, subtraction, multiplication,
|
|
|
+ and division.
|
|
|
+* Integer Arithmetic:: How C performs arithmetic with integer values.
|
|
|
+* Integer Overflow:: When an integer value exceeds the range
|
|
|
+ of its type.
|
|
|
+* Mixed Mode:: Calculating with both integer values
|
|
|
+ and floating-point values.
|
|
|
+* Division and Remainder:: How integer division works.
|
|
|
+* Numeric Comparisons:: Comparing numeric values for
|
|
|
+ equality or order.
|
|
|
+* Shift Operations:: Shift integer bits left or right.
|
|
|
+* Bitwise Operations:: Bitwise conjunction, disjunction, negation.
|
|
|
+
|
|
|
+Assignment Expressions
|
|
|
+
|
|
|
+* Simple Assignment:: The basics of storing a value.
|
|
|
+* Lvalues:: Expressions into which a value can be stored.
|
|
|
+* Modifying Assignment:: Shorthand for changing an lvalue's contents.
|
|
|
+* Increment/Decrement:: Shorthand for incrementing and decrementing
|
|
|
+ an lvalue's contents.
|
|
|
+* Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
|
|
|
+* Assignment in Subexpressions:: How to avoid ambiguity.
|
|
|
+* Write Assignments Separately:: Write assignments as separate statements.
|
|
|
+
|
|
|
+Execution Control Expressions
|
|
|
+
|
|
|
+* Logical Operators:: Logical conjunction, disjunction, negation.
|
|
|
+* Logicals and Comparison:: Logical operators with comparison operators.
|
|
|
+* Logicals and Assignments:: Assignments with logical operators.
|
|
|
+* Conditional Expression:: An if/else construct inside expressions.
|
|
|
+* Comma Operator:: Build a sequence of subexpressions.
|
|
|
+
|
|
|
+Order of Execution
|
|
|
+
|
|
|
+* Reordering of Operands:: Operations in C are not necessarily computed
|
|
|
+ in the order they are written.
|
|
|
+* Associativity and Ordering:: Some associative operations are performed
|
|
|
+ in a particular order; others are not.
|
|
|
+* Sequence Points:: Some guarantees about the order of operations.
|
|
|
+* Postincrement and Ordering:: Ambiguous execution order with postincrement.
|
|
|
+* Ordering of Operands:: Evaluation order of operands
|
|
|
+ and function arguments.
|
|
|
+* Optimization and Ordering:: Compiler optimizations can reorder operations
|
|
|
+ only if it has no impact on program results.
|
|
|
+
|
|
|
+Primitive Data Types
|
|
|
+
|
|
|
+* Integer Types:: Description of integer types.
|
|
|
+* Floating-Point Data Types:: Description of floating-point types.
|
|
|
+* Complex Data Types:: Description of complex number types.
|
|
|
+* The Void Type:: A type indicating no value at all.
|
|
|
+* Other Data Types:: A brief summary of other types.
|
|
|
+
|
|
|
+Constants
|
|
|
+
|
|
|
+* Integer Constants:: Literal integer values.
|
|
|
+* Integer Const Type:: Types of literal integer values.
|
|
|
+* Floating Constants:: Literal floating-point values.
|
|
|
+* Imaginary Constants:: Literal imaginary number values.
|
|
|
+* Invalid Numbers:: Avoiding preprocessing number misconceptions.
|
|
|
+* Character Constants:: Literal character values.
|
|
|
+* Unicode Character Codes:: Unicode characters represented
|
|
|
+ in either UTF-16 or UTF-32.
|
|
|
+* Wide Character Constants:: Literal character values larger than 8 bits.
|
|
|
+* String Constants:: Literal string values.
|
|
|
+* UTF-8 String Constants:: Literal UTF-8 string values.
|
|
|
+* Wide String Constants:: Literal string values made up of
|
|
|
+ 16- or 32-bit characters.
|
|
|
+
|
|
|
+Pointers
|
|
|
+
|
|
|
+* Address of Data:: Using the ``address-of'' operator.
|
|
|
+* Pointer Types:: For each type, there is a pointer type.
|
|
|
+* Pointer Declarations:: Declaring variables with pointer types.
|
|
|
+* Pointer Type Designators:: Designators for pointer types.
|
|
|
+* Pointer Dereference:: Accessing what a pointer points at.
|
|
|
+* Null Pointers:: Pointers which do not point to any object.
|
|
|
+* Invalid Dereference:: Dereferencing null or invalid pointers.
|
|
|
+* Void Pointers:: Totally generic pointers, can cast to any.
|
|
|
+* Pointer Comparison:: Comparing memory address values.
|
|
|
+* Pointer Arithmetic:: Computing memory address values.
|
|
|
+* Pointers and Arrays:: Using pointer syntax instead of array syntax.
|
|
|
+* Pointer Arithmetic Low Level:: More about computing memory address values.
|
|
|
+* Pointer Increment/Decrement:: Incrementing and decrementing pointers.
|
|
|
+* Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
|
|
|
+* Pointer-Integer Conversion:: Converting pointer types to integer types.
|
|
|
+* Printing Pointers:: Using @code{printf} for a pointer's value.
|
|
|
+
|
|
|
+Structures
|
|
|
+
|
|
|
+* Referencing Fields:: Accessing field values in a structure object.
|
|
|
+* Dynamic Memory Allocation:: Allocating space for objects
|
|
|
+ while the program is running.
|
|
|
+* Field Offset:: Memory layout of fields within a structure.
|
|
|
+* Structure Layout:: Planning the memory layout of fields.
|
|
|
+* Packed Structures:: Packing structure fields as close as possible.
|
|
|
+* Bit Fields:: Dividing integer fields
|
|
|
+ into fields with fewer bits.
|
|
|
+* Bit Field Packing:: How bit fields pack together in integers.
|
|
|
+* const Fields:: Making structure fields immutable.
|
|
|
+* Zero Length:: Zero-length array as a variable-length object.
|
|
|
+* Flexible Array Fields:: Another approach to variable-length objects.
|
|
|
+* Overlaying Structures:: Casting one structure type
|
|
|
+ over an object of another structure type.
|
|
|
+* Structure Assignment:: Assigning values to structure objects.
|
|
|
+* Unions:: Viewing the same object in different types.
|
|
|
+* Packing With Unions:: Using a union type to pack various types into
|
|
|
+ the same memory space.
|
|
|
+* Cast to Union:: Casting a value of one of the union's alternative
|
|
|
+ types to the type of the union itself.
|
|
|
+* Structure Constructors:: Building new structure objects.
|
|
|
+* Unnamed Types as Fields:: Fields' types do not always need names.
|
|
|
+* Incomplete Types:: Types which have not been fully defined.
|
|
|
+* Intertwined Incomplete Types:: Defining mutually-recursive structure types.
|
|
|
+* Type Tags:: Scope of structure and union type tags.
|
|
|
+
|
|
|
+Arrays
|
|
|
+
|
|
|
+* Accessing Array Elements:: How to access individual elements of an array.
|
|
|
+* Declaring an Array:: How to name and reserve space for a new array.
|
|
|
+* Strings:: A string in C is a special case of array.
|
|
|
+* Incomplete Array Types:: Naming, but not allocating, a new array.
|
|
|
+* Limitations of C Arrays:: Arrays are not first-class objects.
|
|
|
+* Multidimensional Arrays:: Arrays of arrays.
|
|
|
+* Constructing Array Values:: Assigning values to an entire array at once.
|
|
|
+* Arrays of Variable Length:: Declaring arrays of non-constant size.
|
|
|
+
|
|
|
+Statements
|
|
|
+
|
|
|
+* Expression Statement:: Evaluate an expression, as a statement,
|
|
|
+ usually done for a side effect.
|
|
|
+* if Statement:: Basic conditional execution.
|
|
|
+* if-else Statement:: Multiple branches for conditional execution.
|
|
|
+* Blocks:: Grouping multiple statements together.
|
|
|
+* return Statement:: Return a value from a function.
|
|
|
+* Loop Statements:: Repeatedly executing a statement or block.
|
|
|
+* switch Statement:: Multi-way conditional choices.
|
|
|
+* switch Example:: A plausible example of using @code{switch}.
|
|
|
+* Duffs Device:: A special way to use @code{switch}.
|
|
|
+* Case Ranges:: Ranges of values for @code{switch} cases.
|
|
|
+* Null Statement:: A statement that does nothing.
|
|
|
+* goto Statement:: Jump to another point in the source code,
|
|
|
+ identified by a label.
|
|
|
+* Local Labels:: Labels with limited scope.
|
|
|
+* Labels as Values:: Getting the address of a label.
|
|
|
+* Statement Exprs:: A series of statements used as an expression.
|
|
|
+
|
|
|
+Variables
|
|
|
+
|
|
|
+* Variable Declarations:: Name a variable and reserve space for it.
|
|
|
+* Initializers:: Assigning initial values to variables.
|
|
|
+* Designated Inits:: Assigning initial values to array elements
|
|
|
+ at particular array indices.
|
|
|
+* Auto Type:: Obtaining the type of a variable.
|
|
|
+* Local Variables:: Variables declared in function definitions.
|
|
|
+* File-Scope Variables:: Variables declared outside of
|
|
|
+ function definitions.
|
|
|
+* Static Local Variables:: Variables declared within functions,
|
|
|
+ but with permanent storage allocation.
|
|
|
+* Extern Declarations:: Declaring a variable
|
|
|
+ which is allocated somewhere else.
|
|
|
+* Allocating File-Scope:: When is space allocated
|
|
|
+ for file-scope variables?
|
|
|
+* auto and register:: Historically used storage directions.
|
|
|
+* Omitting Types:: The bad practice of declaring variables
|
|
|
+ with implicit type.
|
|
|
+
|
|
|
+Type Qualifiers
|
|
|
+
|
|
|
+* const:: Variables whose values don't change.
|
|
|
+* volatile:: Variables whose values may be accessed
|
|
|
+ or changed outside of the control of
|
|
|
+ this program.
|
|
|
+* restrict Pointers:: Restricted pointers for code optimization.
|
|
|
+* restrict Pointer Example:: Example of how that works.
|
|
|
+
|
|
|
+Functions
|
|
|
+
|
|
|
+* Function Definitions:: Writing the body of a function.
|
|
|
+* Function Declarations:: Declaring the interface of a function.
|
|
|
+* Function Calls:: Using functions.
|
|
|
+* Function Call Semantics:: Call-by-value argument passing.
|
|
|
+* Function Pointers:: Using references to functions.
|
|
|
+* The main Function:: Where execution of a GNU C program begins.
|
|
|
+
|
|
|
+Type Conversions
|
|
|
+
|
|
|
+* Explicit Type Conversion:: Casting a value from one type to another.
|
|
|
+* Assignment Type Conversions:: Automatic conversion by assignment operation.
|
|
|
+* Argument Promotions:: Automatic conversion of function parameters.
|
|
|
+* Operand Promotions:: Automatic conversion of arithmetic operands.
|
|
|
+* Common Type:: When operand types differ, which one is used?
|
|
|
+
|
|
|
+Scope
|
|
|
+
|
|
|
+* Scope:: Different categories of identifier scope.
|
|
|
+
|
|
|
+Preprocessing
|
|
|
+
|
|
|
+* Preproc Overview:: Introduction to the C preprocessor.
|
|
|
+* Directives:: The form of preprocessor directives.
|
|
|
+* Preprocessing Tokens:: The lexical elements of preprocessing.
|
|
|
+* Header Files:: Including one source file in another.
|
|
|
+* Macros:: Macro expansion by the preprocessor.
|
|
|
+* Conditionals:: Controlling whether to compile some lines
|
|
|
+ or ignore them.
|
|
|
+* Diagnostics:: Reporting warnings and errors.
|
|
|
+* Line Control:: Reporting source line numbers.
|
|
|
+* Null Directive:: A preprocessing no-op.
|
|
|
+
|
|
|
+Integers in Depth
|
|
|
+
|
|
|
+* Integer Representations:: How integer values appear in memory.
|
|
|
+* Maximum and Minimum Values:: Value ranges of integer types.
|
|
|
+
|
|
|
+Floating Point in Depth
|
|
|
+
|
|
|
+* Floating Representations:: How floating-point values appear in memory.
|
|
|
+* Floating Type Specs:: Precise details of memory representations.
|
|
|
+* Special Float Values:: Infinity, Not a Number, and Subnormal Numbers.
|
|
|
+* Invalid Optimizations:: Don't mess up non-numbers and signed zeros.
|
|
|
+* Exception Flags:: Handling certain conditions in floating point.
|
|
|
+* Exact Floating-Point:: Not all floating calculations lose precision.
|
|
|
+* Rounding:: When a floating result can't be represented
|
|
|
+ exactly in the floating-point type in use.
|
|
|
+* Rounding Issues:: Avoid magnifying rounding errors.
|
|
|
+* Significance Loss:: Subtracting numbers that are almost equal.
|
|
|
+* Fused Multiply-Add:: Taking advantage of a special floating-point
|
|
|
+ instruction for faster execution.
|
|
|
+* Error Recovery:: Determining rounding errors.
|
|
|
+* Exact Floating Constants:: Precisely specified floating-point numbers.
|
|
|
+* Handling Infinity:: When floating calculation is out of range.
|
|
|
+* Handling NaN:: When floating calculation is undefined.
|
|
|
+* Signed Zeros:: Positive zero vs. negative zero.
|
|
|
+* Scaling by the Base:: A useful exact floating-point operation.
|
|
|
+* Rounding Control:: Specifying some rounding behaviors.
|
|
|
+* Machine Epsilon:: The smallest number you can add to 1.0
|
|
|
+ and get a sum which is larger than 1.0.
|
|
|
+* Complex Arithmetic:: Details of arithmetic with complex numbers.
|
|
|
+* Round-Trip Base Conversion:: What happens between base-2 and base-10.
|
|
|
+* Further Reading:: References for floating-point numbers.
|
|
|
+
|
|
|
+Directing Compilation
|
|
|
+
|
|
|
+* Pragmas:: Controlling compilation of some constructs.
|
|
|
+* Static Assertions:: Compile-time tests for conditions.
|
|
|
+
|
|
|
+@end detailmenu
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node The First Example
|
|
|
+@chapter The First Example
|
|
|
+
|
|
|
+This chapter presents the source code for a very simple C program and
|
|
|
+uses it to explain a few features of the language. If you already
|
|
|
+know the basic points of C presented in this chapter, you can skim it
|
|
|
+or skip it.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Recursive Fibonacci:: Writing a simple function recursively.
|
|
|
+* Stack:: Each function call uses space in the stack.
|
|
|
+* Iterative Fibonacci:: Writing the same function iteratively.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Recursive Fibonacci
|
|
|
+@section Example: Recursive Fibonacci
|
|
|
+@cindex recursive Fibonacci function
|
|
|
+@cindex Fibonacci function, recursive
|
|
|
+
|
|
|
+To introduce the most basic features of C, let's look at code for a
|
|
|
+simple mathematical function that does calculations on integers. This
|
|
|
+function calculates the @var{n}th number in the Fibonacci series, in
|
|
|
+which each number is the sum of the previous two: 1, 1, 2, 3, 5, 8,
|
|
|
+13, 21, 34, 55, @dots{}.
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+fib (int n)
|
|
|
+@{
|
|
|
+ if (n <= 2) /* @r{This avoids infinite recursion.} */
|
|
|
+ return 1;
|
|
|
+ else
|
|
|
+ return fib (n - 1) + fib (n - 2);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This very simple function definition illustrates several features of C:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+A function definition, whose first two lines constitute the function
|
|
|
+header. @xref{Function Definitions}.
|
|
|
+
|
|
|
+@item
|
|
|
+A function parameter @code{n}, referred to as the variable @code{n}
|
|
|
+inside the function body. @xref{Function Parameter Variables}.
|
|
|
+A function definition uses parameters to refer to the argument
|
|
|
+values provided in a call to that function.
|
|
|
+
|
|
|
+@item
|
|
|
+Arithmetic. C programs add with @samp{+} and subtract with
|
|
|
+@samp{-}. @xref{Arithmetic}.
|
|
|
+
|
|
|
+@item
|
|
|
+Numeric comparisons. The operator @samp{<=} tests for ``less than or
|
|
|
+equal.'' @xref{Numeric Comparisons}.
|
|
|
+
|
|
|
+@item
|
|
|
+Integer constants written in base 10.
|
|
|
+@xref{Integer Constants}.
|
|
|
+
|
|
|
+@item
|
|
|
+A function call. The function call @code{fib (n - 1)} calls the
|
|
|
+function @code{fib}, passing as its argument the value @code{n - 1}.
|
|
|
+@xref{Function Calls}.
|
|
|
+
|
|
|
+@item
|
|
|
+A comment, which starts with @samp{/*} and ends with @samp{*/}. The
|
|
|
+comment has no effect on the execution of the program. Its purpose is
|
|
|
+to provide explanations to people reading the source code. Including
|
|
|
+comments in the code is tremendously important---they provide
|
|
|
+background information so others can understand the code more quickly.
|
|
|
+@xref{Comments}.
|
|
|
+
|
|
|
+@item
|
|
|
+Two kinds of statements, the @code{return} statement and the
|
|
|
+@code{if}@dots{}@code{else} statement. @xref{Statements}.
|
|
|
+
|
|
|
+@item
|
|
|
+Recursion. The function @code{fib} calls itself; that is called a
|
|
|
+@dfn{recursive call}. These are valid in C, and quite common.
|
|
|
+
|
|
|
+The @code{fib} function would not be useful if it didn't return.
|
|
|
+Thus, recursive definitions, to be of any use, must avoid infinite
|
|
|
+recursion.
|
|
|
+
|
|
|
+This function definition prevents infinite recursion by specially
|
|
|
+handling the case where @code{n} is two or less. Thus the maximum
|
|
|
+depth of recursive calls is less than @code{n}.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+@menu
|
|
|
+* Function Header:: The function's name and how it is called.
|
|
|
+* Function Body:: Declarations and statements that implement the function.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Function Header
|
|
|
+@subsection Function Header
|
|
|
+@cindex function header
|
|
|
+
|
|
|
+In our example, the first two lines of the function definition are the
|
|
|
+@dfn{header}. Its purpose is to state the function's name and say how
|
|
|
+it is called:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+fib (int n)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+says that the function returns an integer (type @code{int}), its name is
|
|
|
+@code{fib}, and it takes one argument named @code{n} which is also an
|
|
|
+integer. (Data types will be explained later, in @ref{Primitive Types}.)
|
|
|
+
|
|
|
+@node Function Body
|
|
|
+@subsection Function Body
|
|
|
+@cindex function body
|
|
|
+@cindex recursion
|
|
|
+
|
|
|
+The rest of the function definition is called the @dfn{function body}.
|
|
|
+Like every function body, this one starts with @samp{@{}, ends with
|
|
|
+@samp{@}}, and contains zero or more @dfn{statements} and
|
|
|
+@dfn{declarations}. Statements specify actions to take, whereas
|
|
|
+declarations define names of variables, functions, and so on. Each
|
|
|
+statement and each declaration ends with a semicolon (@samp{;}).
|
|
|
+
|
|
|
+Statements and declarations often contain @dfn{expressions}; an
|
|
|
+expression is a construct whose execution produces a @dfn{value} of
|
|
|
+some data type, but may also take actions through ``side effects''
|
|
|
+that alter subsequent execution. A statement, by contrast, does not
|
|
|
+have a value; it affects further execution of the program only through
|
|
|
+the actions it takes.
|
|
|
+
|
|
|
+This function body contains no declarations, and just one statement,
|
|
|
+but that one is a complex statement in that it contains nested
|
|
|
+statements. This function uses two kinds of statements:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item return
|
|
|
+The @code{return} statement makes the function return immediately.
|
|
|
+It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+return @var{value};
|
|
|
+@end example
|
|
|
+
|
|
|
+Its meaning is to compute the expression @var{value} and exit the
|
|
|
+function, making it return whatever value that expression produced.
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+return 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+returns the integer 1 from the function, and
|
|
|
+
|
|
|
+@example
|
|
|
+return fib (n - 1) + fib (n - 2);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+returns a value computed by performing two function calls
|
|
|
+as specified and adding their results.
|
|
|
+
|
|
|
+@item @code{if}@dots{}@code{else}
|
|
|
+The @code{if}@dots{}@code{else} statement is a @dfn{conditional}.
|
|
|
+Each time it executes, it chooses one of its two substatements to execute
|
|
|
+and ignores the other. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (@var{condition})
|
|
|
+ @var{if-true-statement}
|
|
|
+else
|
|
|
+ @var{if-false-statement}
|
|
|
+@end example
|
|
|
+
|
|
|
+Its meaning is to compute the expression @var{condition} and, if it's
|
|
|
+``true,'' execute @var{if-true-statement}. Otherwise, execute
|
|
|
+@var{if-false-statement}. @xref{if-else Statement}.
|
|
|
+
|
|
|
+Inside the @code{if}@dots{}@code{else} statement, @var{condition} is
|
|
|
+simply an expression. It's considered ``true'' if its value is
|
|
|
+nonzero. (A comparison operation, such as @code{n <= 2}, produces the
|
|
|
+value 1 if it's ``true'' and 0 if it's ``false.'' @xref{Numeric
|
|
|
+Comparisons}.) Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+if (n <= 2)
|
|
|
+ return 1;
|
|
|
+else
|
|
|
+ return fib (n - 1) + fib (n - 2);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+first tests whether the value of @code{n} is less than or equal to 2.
|
|
|
+If so, the expression @code{n <= 2} has the value 1. So execution
|
|
|
+continues with the statement
|
|
|
+
|
|
|
+@example
|
|
|
+return 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Otherwise, execution continues with this statement:
|
|
|
+
|
|
|
+@example
|
|
|
+return fib (n - 1) + fib (n - 2);
|
|
|
+@end example
|
|
|
+
|
|
|
+Each of these statements ends the execution of the function and
|
|
|
+provides a value for it to return. @xref{return Statement}.
|
|
|
+@end table
|
|
|
+
|
|
|
+Calculating @code{fib} using ordinary integers in C works only for
|
|
|
+@var{n} < 47, because the value of @code{fib (47)} is too large to fit
|
|
|
+in type @code{int}. The addition operation that tries to add
|
|
|
+@code{fib (46)} and @code{fib (45)} cannot deliver the correct result.
|
|
|
+This occurrence is called @dfn{integer overflow}.
|
|
|
+
|
|
|
+Overflow can manifest itself in various ways, but one thing that can't
|
|
|
+possibly happen is to produce the correct value, since that can't fit
|
|
|
+in the space for the value. @xref{Integer Overflow}.
|
|
|
+
|
|
|
+@xref{Functions}, for a full explanation about functions.
|
|
|
+
|
|
|
+@node Stack
|
|
|
+@section The Stack, and Stack Overflow
|
|
|
+@cindex stack
|
|
|
+@cindex stack frame
|
|
|
+@cindex stack overflow
|
|
|
+@cindex recursion, drawbacks of
|
|
|
+
|
|
|
+Recursion has a drawback: there are limits to how many nested function
|
|
|
+calls a program can make. In C, each function call allocates a block
|
|
|
+of memory which it uses until the call returns. C allocates these
|
|
|
+blocks consecutively within a large area of memory known as the
|
|
|
+@dfn{stack}, so we refer to the blocks as @dfn{stack frames}.
|
|
|
+
|
|
|
+The size of the stack is limited; if the program tries to use too
|
|
|
+much, that causes the program to fail because the stack is full. This
|
|
|
+is called @dfn{stack overflow}.
|
|
|
+
|
|
|
+@cindex crash
|
|
|
+@cindex segmentation fault
|
|
|
+Stack overflow on GNU/Linux typically manifests itself as the
|
|
|
+@dfn{signal} named @code{SIGSEGV}, also known as a ``segmentation
|
|
|
+fault.'' By default, this signal terminates the program immediately,
|
|
|
+rather than letting the program try to recover, or reach an expected
|
|
|
+ending point. (We commonly say in this case that the program
+``crashes.'') @xref{Signals}.
|
|
|
+
|
|
|
+It is inconvenient to observe a crash by passing too large
|
|
|
+an argument to recursive Fibonacci, because the program would run a
|
|
|
+long time before it crashes. This algorithm is simple but
|
|
|
+ridiculously slow: in calculating @code{fib (@var{n})}, the number of
|
|
|
+(recursive) calls @code{fib (1)} or @code{fib (2)} that it makes equals
|
|
|
+the final result.
|
|
|
+
|
|
|
+However, you can observe stack overflow very quickly if you use
|
|
|
+this function instead:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+fill_stack (int n)
|
|
|
+@{
|
|
|
+ if (n <= 1) /* @r{This limits the depth of recursion.} */
|
|
|
+ return 1;
|
|
|
+ else
|
|
|
+ return fill_stack (n - 1);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Under gNewSense GNU/Linux on the Lemote Yeeloong, without optimization
|
|
|
+and using the default configuration, an experiment showed there is
|
|
|
+enough stack space to do 261906 nested calls to that function. One
|
|
|
+more, and the stack overflows and the program crashes. On another
|
|
|
+platform, with a different configuration, or with a different
|
|
|
+function, the limit might be bigger or smaller.
|
|
|
+
|
|
|
+@node Iterative Fibonacci
|
|
|
+@section Example: Iterative Fibonacci
|
|
|
+@cindex iterative Fibonacci function
|
|
|
+@cindex Fibonacci function, iterative
|
|
|
+
|
|
|
+Here's a much faster algorithm for computing the same Fibonacci
|
|
|
+series. It is faster for two reasons. First, it uses @dfn{iteration}
|
|
|
+(that is, repetition or looping) rather than recursion, so it doesn't
|
|
|
+take time for a large number of function calls. But mainly, it is
|
|
|
+faster because the number of repetitions is small---only @code{@var{n}}.
|
|
|
+
|
|
|
+@c If you change this, change the duplicate in node Example of for.
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+fib (int n)
|
|
|
+@{
|
|
|
+ int last = 1; /* @r{Initial value is @code{fib (1)}.} */
|
|
|
+ int prev = 0; /* @r{Initial value controls @code{fib (2)}.} */
|
|
|
+ int i;
|
|
|
+
|
|
|
+ for (i = 1; i < n; ++i)
|
|
|
+ /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
|
|
|
+ /* @r{since @code{i < n} is false the first time.} */
|
|
|
+ @{
|
|
|
+ /* @r{Now @code{last} is @code{fib (@code{i})}}
|
|
|
+ @r{and @code{prev} is @code{fib (@code{i} @minus{} 1)}.} */
|
|
|
+ /* @r{Compute @code{fib (@code{i} + 1)}.} */
|
|
|
+ int next = prev + last;
|
|
|
+ /* @r{Shift the values down.} */
|
|
|
+ prev = last;
|
|
|
+ last = next;
|
|
|
+ /* @r{Now @code{last} is @code{fib (@code{i} + 1)}}
|
|
|
+ @r{and @code{prev} is @code{fib (@code{i})}.}
|
|
|
+ @r{But that won't stay true for long,}
|
|
|
+ @r{because we are about to increment @code{i}.} */
|
|
|
+ @}
|
|
|
+
|
|
|
+ return last;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This definition computes @code{fib (@var{n})} in a time proportional
|
|
|
+to @code{@var{n}}. The comments in the definition explain how it works: it
|
|
|
+advances through the series, always keeps the last two values in
|
|
|
+@code{last} and @code{prev}, and adds them to get the next value.
|
|
|
+
|
|
|
+Here are the additional C features that this definition uses:
|
|
|
+
|
|
|
+@table @asis
|
|
|
+@item Internal blocks
|
|
|
+Within a function, wherever a statement is called for, you can write a
|
|
|
+@dfn{block}. It looks like @code{@{ @r{@dots{}} @}} and contains zero or
|
|
|
+more statements and declarations. (You can also use additional
|
|
|
+blocks as statements in a block.)
|
|
|
+
|
|
|
+The function body also counts as a block, which is why it can contain
|
|
|
+statements and declarations.
|
|
|
+
|
|
|
+@xref{Blocks}.
|
|
|
+
|
|
|
+@item Declarations of local variables
|
|
|
+This function body contains declarations as well as statements. There
|
|
|
+are three declarations directly in the function body, as well as a
|
|
|
+fourth declaration in an internal block. Each starts with @code{int}
|
|
|
+because it declares a variable whose type is integer. One declaration
|
|
|
+can declare several variables, but each of these declarations is
|
|
|
+simple and declares just one variable.
|
|
|
+
|
|
|
+Variables declared inside a block (either a function body or an
|
|
|
+internal block) are @dfn{local variables}. These variables exist only
|
|
|
+within that block; their names are not defined outside the block, and
|
|
|
+exiting the block deallocates their storage. This example declares
|
|
|
+four local variables: @code{last}, @code{prev}, @code{i}, and
|
|
|
+@code{next}.
|
|
|
+
|
|
|
+The most basic local variable declaration looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{type} @var{variablename};
|
|
|
+@end example
|
|
|
+
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares the local variable @code{i} as an integer.
|
|
|
+@xref{Variable Declarations}.
|
|
|
+
|
|
|
+@item Initializers
|
|
|
+When you declare a variable, you can also specify its initial value,
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{type} @var{variablename} = @var{value};
|
|
|
+@end example
|
|
|
+
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int last = 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares the local variable @code{last} as an integer (type
|
|
|
+@code{int}) and starts it off with the value 1. @xref{Initializers}.
|
|
|
+
|
|
|
+@item Assignment
|
|
|
+An assignment is a specific kind of expression, written with the @samp{=}
+operator, that stores a new value in a variable or other place. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+@var{variable} = @var{value}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is an expression that computes @code{@var{value}} and stores the value in
|
|
|
+@code{@var{variable}}. @xref{Assignment Expressions}.
|
|
|
+
|
|
|
+@item Expression statements
|
|
|
+An expression statement is an expression followed by a semicolon.
|
|
|
+That computes the value of the expression, then ignores the value.
|
|
|
+
|
|
|
+An expression statement is useful when the expression changes some
|
|
|
+data or has other side effects---for instance, with function calls, or
|
|
|
+with assignments as in this example. @xref{Expression Statement}.
|
|
|
+
|
|
|
+Using an expression with no side effects in an expression statement is
|
|
|
+pointless except in very special cases. For instance, the expression
|
|
|
+statement @code{x;} would examine the value of @code{x} and ignore it.
|
|
|
+That is not useful.
|
|
|
+
|
|
|
+@item Increment operator
|
|
|
+The increment operator is @samp{++}. @code{++i} is an
|
|
|
+expression that is short for @code{i = i + 1}.
|
|
|
+@xref{Increment/Decrement}.
|
|
|
+
|
|
|
+@item @code{for} statements
|
|
|
+A @code{for} statement is a clean way of executing a statement
|
|
|
+repeatedly---a @dfn{loop} (@pxref{Loop Statements}). Specifically,
|
|
|
+
|
|
|
+@example
|
|
|
+for (i = 1; i < n; ++i)
|
|
|
+ @var{body}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+means to start by doing @code{i = 1} (set @code{i} to one) to prepare
|
|
|
+for the loop. The loop itself consists of
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+Testing @code{i < n} and exiting the loop if that's false.
|
|
|
+
|
|
|
+@item
|
|
|
+Executing @var{body}.
|
|
|
+
|
|
|
+@item
|
|
|
+Advancing the loop (executing @code{++i}, which increments @code{i}).
|
|
|
+@end itemize
|
|
|
+
|
|
|
+The net result is to execute @var{body} with 1 in @code{i},
+then with 2 in @code{i}, and so on, stopping just before the repetition
+where @code{i} would equal @code{n}. If @code{n} is 1 or less,
+@var{body} is not executed at all.
|
|
|
+
|
|
|
+The body of the @code{for} statement must be one and only one
|
|
|
+statement. You can't write two statements in a row there; if you try
|
|
|
+to, only the first of them will be treated as part of the loop.
|
|
|
+
|
|
|
+The way to put multiple statements there is to group them
+with a block, and that's what we do in this example.
|
|
|
+@end table
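To make the order of those steps concrete, here is a sketch that spells out the same loop with a @code{while} statement (@pxref{Loop Statements}) in place of @code{for}. The function name @code{count_iterations} is made up for this illustration; it returns how many times the body runs for a given @code{n}.

```c
/* Hypothetical function, for illustration only.
   Count how many times the body of
   for (i = 1; i < n; ++i) would run.  */
int
count_iterations (int n)
{
  int count = 0;
  int i;

  i = 1;                /* Prepare for the loop.  */
  while (i < n)         /* Test; exit the loop if this is false.  */
    {
      ++count;          /* The body.  */
      ++i;              /* Advance the loop.  */
    }

  return count;
}
```

For instance, @code{count_iterations (5)} returns 4, since the body runs with 1, 2, 3 and 4 in @code{i}. When @code{n} is 1 or less, it returns 0.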
|
|
|
+
|
|
|
+@node Complete Program
|
|
|
+@chapter A Complete Program
|
|
|
+@cindex complete example program
|
|
|
+@cindex example program, complete
|
|
|
+
|
|
|
+It's all very well to write a Fibonacci function, but you cannot run
|
|
|
+it by itself. It is a useful piece of code, but it is not a complete
+program.
|
|
|
+
|
|
|
+In this chapter we present a complete program that contains the
|
|
|
+@code{fib} function. This example shows how to make the program
|
|
|
+start, how to make it finish, how to do computation, and how to print
|
|
|
+a result.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Complete Example:: Turn the simple function into a full program.
|
|
|
+* Complete Explanation:: Explanation of each part of the example.
|
|
|
+* Complete Line-by-Line:: Explaining each line of the example.
|
|
|
+* Compile Example:: Using GCC to compile the example.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Complete Example
|
|
|
+@section Complete Program Example
|
|
|
+
|
|
|
+Here is the complete program that uses the simple, recursive version
|
|
|
+of the @code{fib} function (@pxref{Recursive Fibonacci}):
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h>
|
|
|
+
|
|
|
+int
|
|
|
+fib (int n)
|
|
|
+@{
|
|
|
+ if (n <= 2) /* @r{This avoids infinite recursion.} */
|
|
|
+ return 1;
|
|
|
+ else
|
|
|
+ return fib (n - 1) + fib (n - 2);
|
|
|
+@}
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ printf ("Fibonacci series item %d is %d\n",
|
|
|
+ 20, fib (20));
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This program prints a message that shows the value of @code{fib (20)}.
|
|
|
+
|
|
|
+Now for an explanation of what that code means.
|
|
|
+
|
|
|
+@node Complete Explanation
|
|
|
+@section Complete Program Explanation
|
|
|
+
|
|
|
+@ifnottex
|
|
|
+Here's the explanation of the code of the example in the
|
|
|
+previous section.
|
|
|
+@end ifnottex
|
|
|
+
|
|
|
+This sample program prints a message that shows the value of @code{fib
|
|
|
+(20)}, and exits with code 0 (which stands for successful execution).
|
|
|
+
|
|
|
+Every C program is started by running the function named @code{main}.
|
|
|
+Therefore, the example program defines a function named @code{main} to
|
|
|
+provide a way to start it. Whatever that function does is what the
|
|
|
+program does. @xref{The main Function}.
|
|
|
+
|
|
|
+The @code{main} function is the first one called when the program
|
|
|
+runs, but it doesn't come first in the example code. The order of the
|
|
|
+function definitions in the source code makes no difference to the
|
|
|
+program's meaning.
|
|
|
+
|
|
|
+The initial call to @code{main} always passes certain arguments, but
|
|
|
+@code{main} does not have to pay attention to them. To ignore those
|
|
|
+arguments, define @code{main} with @code{void} as the parameter list.
|
|
|
+(@code{void} as a function's parameter list normally means ``call with
|
|
|
+no arguments,'' but @code{main} is a special case.)
|
|
|
+
|
|
|
+The function @code{main} returns 0 because that is
|
|
|
+the conventional way for @code{main} to indicate successful execution.
|
|
|
+It could instead return a positive integer to indicate failure, and
|
|
|
+some utility programs have specific conventions for the meaning of
|
|
|
+certain numeric @dfn{failure codes}. @xref{Values from main}.
|
|
|
+
|
|
|
+@cindex @code{printf}
|
|
|
+The simplest way to print text in C is by calling the @code{printf}
|
|
|
+function, so here we explain what that does.
|
|
|
+
|
|
|
+@cindex standard output
|
|
|
+The first argument to @code{printf} is a @dfn{string constant}
|
|
|
+(@pxref{String Constants}) that is a template for output. The
|
|
|
+function @code{printf} copies most of that string directly as output,
|
|
|
+including the newline character at the end of the string, which is
|
|
|
+written as @samp{\n}. The output goes to the program's @dfn{standard
|
|
|
+output} destination, which in the usual case is the terminal.
|
|
|
+
|
|
|
+@samp{%} in the template introduces a code that substitutes other text
|
|
|
+into the output. Specifically, @samp{%d} means to take the next
|
|
|
+argument to @code{printf} and substitute it into the text as a decimal
|
|
|
+number. (The argument for @samp{%d} must be of type @code{int}; if it
|
|
|
+isn't, @code{printf} will malfunction.) So the output is a line that
|
|
|
+looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+Fibonacci series item 20 is 6765
|
|
|
+@end example
|
|
|
+
|
|
|
+This program does not contain a definition for @code{printf} because
|
|
|
+it is defined by the C library, which makes it available in all C
|
|
|
+programs. However, each program does need to @dfn{declare}
|
|
|
+@code{printf} so it will be called correctly. The @code{#include}
|
|
|
+line takes care of that; it includes a @dfn{header file} called
|
|
|
+@file{stdio.h} into the program's code. That file is provided by the
|
|
|
+operating system and it contains declarations for the many standard
|
|
|
+input/output functions in the C library, one of which is
|
|
|
+@code{printf}.
|
|
|
+
|
|
|
+Don't worry about header files for now; we'll explain them later in
|
|
|
+@ref{Header Files}.
|
|
|
+
|
|
|
+The first argument of @code{printf} does not have to be a string
|
|
|
+constant; it can be any string (@pxref{Strings}). However, using a
|
|
|
+constant is the most common case.
|
|
|
+
|
|
|
+To learn more about @code{printf} and other facilities of the C
|
|
|
+library, see @ref{Top, The GNU C Library, , libc, The GNU C Library
|
|
|
+Reference Manual}.
|
|
|
+
|
|
|
+@node Complete Line-by-Line
|
|
|
+@section Complete Program, Line by Line
|
|
|
+
|
|
|
+Here's the same example, explained line by line.
|
|
|
+@strong{Beginners, do you find this helpful or not?
|
|
|
+Would you prefer a different layout for the example?
|
|
|
+Please tell rms@@gnu.org.}
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Include declaration of usual} */
|
|
|
+ /* @r{I/O functions such as @code{printf}.} */
|
|
|
+ /* @r{Most programs need these.} */
|
|
|
+
|
|
|
+int /* @r{This function returns an @code{int}.} */
|
|
|
+fib (int n) /* @r{Its name is @code{fib};} */
|
|
|
+ /* @r{its argument is called @code{n}.} */
|
|
|
+@{ /* @r{Start of function body.} */
|
|
|
+ /* @r{This stops the recursion from being infinite.} */
|
|
|
+ if (n <= 2) /* @r{If @code{n} is 1 or 2,} */
|
|
|
+ return 1; /* @r{make @code{fib} return 1.} */
|
|
|
+ else /* @r{otherwise, add the two previous} */
|
|
|
+ /* @r{Fibonacci numbers.} */
|
|
|
+ return fib (n - 1) + fib (n - 2);
|
|
|
+@}
|
|
|
+
|
|
|
+int /* @r{This function returns an @code{int}.} */
|
|
|
+main (void) /* @r{Start here; ignore arguments.} */
|
|
|
+@{ /* @r{Print message with numbers in it.} */
|
|
|
+ printf ("Fibonacci series item %d is %d\n",
|
|
|
+ 20, fib (20));
|
|
|
+ return 0; /* @r{Terminate program, report success.} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Compile Example
|
|
|
+@section Compiling the Example Program
|
|
|
+@cindex compiling
|
|
|
+@cindex executable file
|
|
|
+
|
|
|
+To run a C program requires converting the source code into an
|
|
|
+@dfn{executable file}. This is called @dfn{compiling} the program,
|
|
|
+and the command to do that using GNU C is @command{gcc}.
|
|
|
+
|
|
|
+This example program consists of a single source file. If we
|
|
|
+call that file @file{fib1.c}, the complete command to compile it is
|
|
|
+this:
|
|
|
+
|
|
|
+@example
|
|
|
+gcc -g -O -o fib1 fib1.c
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Here, @option{-g} says to generate debugging information, @option{-O}
|
|
|
+says to optimize at the basic level, and @option{-o fib1} says to put
|
|
|
+the executable program in the file @file{fib1}.
|
|
|
+
|
|
|
+To run the program, use its file name as a shell command.
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+./fib1
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+However, unless you are sure the program is correct, you should
|
|
|
+expect to need to debug it. So use this command,
|
|
|
+
|
|
|
+@example
|
|
|
+gdb fib1
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which starts the GDB debugger (@pxref{Sample Session, Sample Session,
|
|
|
+A Sample GDB Session, gdb, Debugging with GDB}) so you can run and
|
|
|
+debug the executable program @file{fib1}.
|
|
|
+
|
|
|
+
|
|
|
+@xref{Compilation}, for an introduction to compiling more complex
|
|
|
+programs which consist of more than one source file.
|
|
|
+
|
|
|
+@node Storage
|
|
|
+@chapter Storage and Data
|
|
|
+@cindex bytes
|
|
|
+@cindex storage organization
|
|
|
+@cindex memory organization
|
|
|
+
|
|
|
+Storage in C programs is made up of units called @dfn{bytes}. On
|
|
|
+nearly all computers, a byte consists of 8 bits, but there are a few
|
|
|
+peculiar computers (mostly ``embedded controllers'' for very small
|
|
|
+systems) where a byte is longer than that. This manual does not try
|
|
|
+to explain the peculiarity of those computers; we assume that a byte
|
|
|
+is 8 bits.
|
|
|
+
|
|
|
+Every C data type is made up of a certain number of bytes; that number
|
|
|
+is the data type's @dfn{size}. @xref{Type Size}, for details. The
|
|
|
+types @code{signed char} and @code{unsigned char} are one byte long;
|
|
|
+use those types to operate on data byte by byte. @xref{Signed and
|
|
|
+Unsigned Types}. You can refer to a series of consecutive bytes as an
|
|
|
+array of @code{char} elements; that's what an ASCII string looks like
|
|
|
+in memory. @xref{String Constants}.
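As an illustration of operating on data byte by byte, the sketch below examines the bytes of an arbitrary object through a pointer to @code{unsigned char}. The function @code{count_nonzero_bytes} is hypothetical, and the example relies on pointers and on the type @code{size_t} from @file{stddef.h}, neither of which has been introduced at this point.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Return how many of the SIZE bytes starting at ADDR are nonzero.  */
size_t
count_nonzero_bytes (const void *addr, size_t size)
{
  /* Viewing the object as unsigned char gives access to each byte.  */
  const unsigned char *bytes = addr;
  size_t count = 0;
  size_t i;

  for (i = 0; i < size; i++)
    if (bytes[i] != 0)
      count++;

  return count;
}
```

Given the declaration @code{char word[] = "abc";}, the call @code{count_nonzero_bytes (word, sizeof (word))} returns 3: the array occupies four bytes, and the last one is the null character that ends the string.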
|
|
|
+
|
|
|
+@node Beyond Integers
|
|
|
+@chapter Beyond Integers
|
|
|
+
|
|
|
+So far we've presented programs that operate on integers. In this
|
|
|
+chapter we'll present examples of handling non-integral numbers and
|
|
|
+arrays of numbers.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Float Example:: A function that uses floating-point numbers.
|
|
|
+* Array Example:: A function that works with arrays.
|
|
|
+* Array Example Call:: How to call that function.
|
|
|
+* Array Example Variations:: Different ways to write the call example.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Float Example
|
|
|
+@section An Example with Non-Integer Numbers
|
|
|
+@cindex floating point example
|
|
|
+
|
|
|
+Here's a function that operates on and returns @dfn{floating point}
|
|
|
+numbers that don't have to be integers. Floating point represents a
|
|
|
+number as a fraction together with a power of 2. (For more detail,
|
|
|
+@pxref{Floating-Point Data Types}.) This example calculates the
|
|
|
+average of three floating point numbers that are passed to it as
|
|
|
+arguments:
|
|
|
+
|
|
|
+@example
|
|
|
+double
|
|
|
+average_of_three (double a, double b, double c)
|
|
|
+@{
|
|
|
+ return (a + b + c) / 3;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The values of the parameters @var{a}, @var{b} and @var{c} do not have to be
|
|
|
+integers, and even when they happen to be integers, most likely their
|
|
|
+average is not an integer.
|
|
|
+
|
|
|
+@code{double} is the usual data type in C for calculations on
|
|
|
+floating-point numbers.
|
|
|
+
|
|
|
+To print a @code{double} with @code{printf}, we must use @samp{%f}
|
|
|
+instead of @samp{%d}:
|
|
|
+
|
|
|
+@example
|
|
|
+printf ("Average is %f\n",
|
|
|
+ average_of_three (1.1, 9.8, 3.62));
|
|
|
+@end example
|
|
|
+
|
|
|
+The code that calls @code{printf} must pass a @code{double} for
|
|
|
+printing with @samp{%f} and an @code{int} for printing with @samp{%d}.
|
|
|
+If the argument has the wrong type, @code{printf} will produce garbage
|
|
|
+output.
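One way to check the text that @samp{%f} produces, sketched here, is to format into a character array with the standard function @code{snprintf} instead of printing. The function @code{format_average} and its parameters are invented for this example, and the definition of @code{average_of_three} is repeated so that the example stands alone.

```c
#include <stdio.h>

double
average_of_three (double a, double b, double c)
{
  return (a + b + c) / 3;
}

/* Hypothetical helper, for illustration only.
   Write the message into BUFFER, which holds SIZE bytes, and
   return the number of characters the full message needs.  */
int
format_average (char *buffer, size_t size)
{
  /* %f expects a double, which is what average_of_three returns.  */
  return snprintf (buffer, size, "Average is %f\n",
                   average_of_three (1.1, 9.8, 3.62));
}
```

By default @samp{%f} prints six digits after the decimal point, so with these arguments the buffer receives @samp{Average is 4.840000} followed by a newline.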
|
|
|
+
|
|
|
+Here's a complete program that computes the average of three
|
|
|
+specific numbers and prints the result:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h>
+
+double
|
|
|
+average_of_three (double a, double b, double c)
|
|
|
+@{
|
|
|
+ return (a + b + c) / 3;
|
|
|
+@}
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ printf ("Average is %f\n",
|
|
|
+ average_of_three (1.1, 9.8, 3.62));
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+From now on we will not present complete programs with a definition of
+@code{main}. Instead we encourage you to write them for yourself when
+you want to test executing some code.
|
|
|
+
|
|
|
+@node Array Example
|
|
|
+@section An Example with Arrays
|
|
|
+@cindex array example
|
|
|
+
|
|
|
+A function to take the average of three numbers is very specific and
|
|
|
+limited. A more general function would take the average of any number
|
|
|
+of numbers. That requires passing the numbers in an array. An array
|
|
|
+is an object in memory that contains a series of values of the same
|
|
|
+data type. This chapter presents the basic concepts and use of arrays
|
|
|
+through an example; for the full explanation, see @ref{Arrays}.
|
|
|
+
|
|
|
+Here's a function definition to take the average of several
|
|
|
+floating-point numbers, passed as type @code{double}. The first
|
|
|
+parameter, @code{length}, specifies how many numbers are passed. The
|
|
|
+second parameter, @code{input_data}, is an array that holds those
|
|
|
+numbers.
|
|
|
+
|
|
|
+@example
|
|
|
+double
|
|
|
+avg_of_double (int length, double input_data[])
|
|
|
+@{
|
|
|
+ double sum = 0;
|
|
|
+ int i;
|
|
|
+
|
|
|
+ for (i = 0; i < length; i++)
|
|
|
+ sum = sum + input_data[i];
|
|
|
+
|
|
|
+ return sum / length;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This introduces the expression to refer to an element of an array:
|
|
|
+@code{input_data[i]} means the element at index @code{i} in
|
|
|
+@code{input_data}. The index of the element can be any expression
|
|
|
+with an integer value; in this case, the expression is @code{i}.
|
|
|
+@xref{Accessing Array Elements}.
|
|
|
+
|
|
|
+@cindex zero-origin indexing
|
|
|
+The lowest valid index in an array is 0, @emph{not} 1, and the highest
|
|
|
+valid index is one less than the number of elements. (This is known
|
|
|
+as @dfn{zero-origin indexing}.)
|
|
|
+
|
|
|
+This example also introduces the way to declare that a function
|
|
|
+parameter is an array. Such declarations are modeled after the syntax
|
|
|
+for an element of the array. Just as @code{double foo} declares that
|
|
|
+@code{foo} is of type @code{double}, @code{double input_data[]}
|
|
|
+declares that each element of @code{input_data} is of type
|
|
|
+@code{double}. Therefore, @code{input_data} itself has type ``array
|
|
|
+of @code{double}.''
|
|
|
+
|
|
|
+When declaring an array parameter, it's not necessary to say how long
|
|
|
+the array is. In this case, the parameter @code{input_data} has no
|
|
|
+length information. That's why the function needs another parameter,
|
|
|
+@code{length}, for the caller to provide that information to the
|
|
|
+function @code{avg_of_double}.
|
|
|
+
|
|
|
+@node Array Example Call
|
|
|
+@section Calling the Array Example
|
|
|
+
|
|
|
+To call the function @code{avg_of_double} requires making an
|
|
|
+array and then passing it as an argument. Here is an example.
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ /* @r{The array of values to average.} */
|
|
|
+ double nums_to_average[5];
|
|
|
+ /* @r{The average, once we compute it.} */
|
|
|
+ double average;
|
|
|
+
|
|
|
+ /* @r{Fill in elements of @code{nums_to_average}.} */
|
|
|
+
|
|
|
+ nums_to_average[0] = 58.7;
|
|
|
+ nums_to_average[1] = 5.1;
|
|
|
+ nums_to_average[2] = 7.7;
|
|
|
+ nums_to_average[3] = 105.2;
|
|
|
+ nums_to_average[4] = -3.14159;
|
|
|
+
|
|
|
+ average = avg_of_double (5, nums_to_average);
|
|
|
+
|
|
|
+ /* @r{@dots{}now make use of @code{average}@dots{}} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This shows an array subscripting expression again, this time
|
|
|
+on the left side of an assignment, storing a value into an
|
|
|
+element of an array.
|
|
|
+
|
|
|
+It also shows how to declare a local variable that is an array:
|
|
|
+@code{double nums_to_average[5];}. Since this declaration allocates the
|
|
|
+space for the array, it needs to know the array's length. You can
|
|
|
+specify the length with any expression whose value is an integer, but
|
|
|
+in this declaration the length is a constant, the integer 5.
|
|
|
+
|
|
|
+The name of the array, when used by itself as an expression, stands
|
|
|
+for the address of the array's data, and that's what gets passed to
|
|
|
+the function @code{avg_of_double} in @code{avg_of_double (5,
|
|
|
+nums_to_average)}.
|
|
|
+
|
|
|
+We can make the code easier to maintain by avoiding the need to write
|
|
|
+5, the array length, when calling @code{avg_of_double}. That way, if
|
|
|
+we change the array to include more elements, we won't have to change
|
|
|
+that call. One way to do this is with the @code{sizeof} operator:
|
|
|
+
|
|
|
+@example
|
|
|
+ average = avg_of_double ((sizeof (nums_to_average)
|
|
|
+ / sizeof (nums_to_average[0])),
|
|
|
+ nums_to_average);
|
|
|
+@end example
|
|
|
+
|
|
|
+This computes the number of elements in @code{nums_to_average} by dividing
|
|
|
+its total size by the size of one element. @xref{Type Size}, for more
|
|
|
+details of using @code{sizeof}.
|
|
|
+
|
|
|
+We don't show in this example what happens after storing the result of
|
|
|
+@code{avg_of_double} in the variable @code{average}. Presumably
|
|
|
+more code would follow that uses that result somehow. (Why compute
|
|
|
+the average and not use it?) But that isn't part of this topic.
|
|
|
+
|
|
|
+@node Array Example Variations
|
|
|
+@section Variations for Array Example
|
|
|
+
|
|
|
+The code to call @code{avg_of_double} has two declarations that
|
|
|
+start with the same data type:
|
|
|
+
|
|
|
+@example
|
|
|
+ /* @r{The array of values to average.} */
|
|
|
+ double nums_to_average[5];
|
|
|
+ /* @r{The average, once we compute it.} */
|
|
|
+ double average;
|
|
|
+@end example
|
|
|
+
|
|
|
+In C, you can combine the two, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+ double nums_to_average[5], average;
|
|
|
+@end example
|
|
|
+
|
|
|
+This declares @code{nums_to_average} so each of its elements is a
|
|
|
+@code{double}, and @code{average} so that it simply is a
|
|
|
+@code{double}.
|
|
|
+
|
|
|
+However, while you @emph{can} combine them, that doesn't mean you
|
|
|
+@emph{should}. If it is useful to write comments about the variables,
|
|
|
+and usually it is, then it's clearer to keep the declarations separate
|
|
|
+so you can put a comment on each one.
|
|
|
+
|
|
|
+We set all of the elements of the array @code{nums_to_average} with
|
|
|
+assignments, but it is more convenient to use an initializer in the
|
|
|
+declaration:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ /* @r{The array of values to average.} */
|
|
|
+ double nums_to_average[]
|
|
|
+ = @{ 58.7, 5.1, 7.7, 105.2, -3.14159 @};
|
|
|
+
|
|
|
+ /* @r{The average, once we compute it.} */
|
|
|
+  double average
+    = avg_of_double ((sizeof (nums_to_average)
+                      / sizeof (nums_to_average[0])),
+                     nums_to_average);
|
|
|
+
|
|
|
+ /* @r{@dots{}now make use of @code{average}@dots{}} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The array initializer is a comma-separated list of values, delimited
|
|
|
+by braces. @xref{Initializers}.
|
|
|
+
|
|
|
+Note that the declaration does not specify a size for
|
|
|
+@code{nums_to_average}, so the size is determined from the
|
|
|
+initializer. There are five values in the initializer, so
|
|
|
+@code{nums_to_average} gets length 5. If we add another element to
|
|
|
+the initializer, @code{nums_to_average} will have six elements.
|
|
|
+
|
|
|
+Because the code computes the number of elements from the size of
|
|
|
+the array, using @code{sizeof}, the program will operate on all the
|
|
|
+elements in the initializer, regardless of how many those are.
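Putting these variations together, the sketch below wraps the @code{sizeof} computation in a macro so it need not be written out at each call. The names @code{ARRAY_LENGTH} and @code{average_example} are assumptions made for this illustration, and macros have not been introduced at this point in the manual. The definition of @code{avg_of_double} is repeated so that the example stands alone.

```c
/* Hypothetical macro, for illustration only: the number of
   elements in ARRAY.  It works only when ARRAY is declared as an
   array in the current scope, not when it is an array parameter.  */
#define ARRAY_LENGTH(array) (sizeof (array) / sizeof ((array)[0]))

double
avg_of_double (int length, double input_data[])
{
  double sum = 0;
  int i;

  for (i = 0; i < length; i++)
    sum = sum + input_data[i];

  return sum / length;
}

double
average_example (void)
{
  /* The array of values to average; its length comes
     from the initializer.  */
  double nums_to_average[]
    = { 58.7, 5.1, 7.7, 105.2, -3.14159 };

  return avg_of_double ((int) ARRAY_LENGTH (nums_to_average),
                        nums_to_average);
}
```

Adding a sixth value to the initializer makes @code{ARRAY_LENGTH (nums_to_average)} evaluate to 6 with no other change to the code.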
|
|
|
+
|
|
|
+@node Lexical Syntax
|
|
|
+@chapter Lexical Syntax
|
|
|
+@cindex lexical syntax
|
|
|
+@cindex token
|
|
|
+
|
|
|
+To start the full description of the C language, we explain the
|
|
|
+lexical syntax and lexical units of C code. The lexical units of a
|
|
|
+programming language are known as @dfn{tokens}. This chapter covers
|
|
|
+all the tokens of C except for constants, which are covered in a later
|
|
|
+chapter (@pxref{Constants}). One vital kind of token is the
|
|
|
+@dfn{identifier} (@pxref{Identifiers}), which is used for names of any
|
|
|
+kind.
|
|
|
+
|
|
|
+@menu
|
|
|
+* English:: Write programs in English!
|
|
|
+* Characters:: The characters allowed in C programs.
|
|
|
+* Whitespace:: The particulars of whitespace characters.
|
|
|
+* Comments:: How to include comments in C code.
|
|
|
+* Identifiers:: How to form identifiers (names).
|
|
|
+* Operators/Punctuation:: Characters used as operators or punctuation.
|
|
|
+* Line Continuation:: Splitting one line into multiple lines.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node English
|
|
|
+@section Write Programs in English!
|
|
|
+
|
|
|
+In principle, you can write the function and variable names in a
|
|
|
+program, and the comments, in any human language. C allows any kind
+of character in comments, and you can put non-ASCII characters into
+identifiers with a special prefix. However, to enable programmers in
+all countries to understand and develop the program, it is best, given
+today's circumstances, to write identifiers and comments in
+English.
|
|
|
+
|
|
|
+English is the one language that programmers in all countries
|
|
|
+generally study. If a program's names are in English, most
|
|
|
+programmers in Bangladesh, Belgium, Bolivia, Brazil, and Bulgaria can
|
|
|
+understand them. Most programmers in those countries can speak
|
|
|
+English, or at least read it, but they do not read each other's
|
|
|
+languages at all. In India, with so many languages, two programmers
|
|
|
+may have no common language other than English.
|
|
|
+
|
|
|
+If you don't feel confident in writing English, do the best you can,
|
|
|
+and follow each English comment with a version in a language you
|
|
|
+write better; add a note asking others to translate that to English.
|
|
|
+Someone will eventually do that.
|
|
|
+
|
|
|
+The program's user interface is a different matter. We don't need to
|
|
|
+choose one language for that; it is easy to support multiple languages
|
|
|
+and let each user choose the language to use. This requires writing
|
|
|
+the program to support localization of its interface. (The
|
|
|
+@code{gettext} package exists to support this; @pxref{Message
|
|
|
+Translation, The GNU C Library, , libc, The GNU C Library Reference
|
|
|
+Manual}.) Then a community-based translation effort can provide
|
|
|
+support for all the languages users want to use.
|
|
|
+
|
|
|
+@node Characters
|
|
|
+@section Characters
|
|
|
+@cindex character set
|
|
|
+@cindex Unicode
|
|
|
+
|
|
|
+@c ??? How to express ¶?
|
|
|
+
|
|
|
+GNU C source files are usually written in the
|
|
|
+@url{https://en.wikipedia.org/wiki/ASCII,,ASCII} character set, which
|
|
|
+was defined in the 1960s for English. However, they can also include
|
|
|
+Unicode characters represented in the
|
|
|
+@url{https://en.wikipedia.org/wiki/UTF-8,,UTF-8} multibyte encoding.
|
|
|
+This makes it possible to represent accented letters such as @samp{á},
|
|
|
+as well as other scripts such as Arabic, Chinese, Cyrillic, Hebrew,
|
|
|
+Japanese, and Korean.@footnote{On some obscure systems, GNU C uses
|
|
|
+UTF-EBCDIC instead of UTF-8, but that is not worth describing in this
|
|
|
+manual.}
|
|
|
+
|
|
|
+In C source code, non-ASCII characters are valid in comments, in wide
|
|
|
+character constants (@pxref{Wide Character Constants}), and in string
|
|
|
+constants (@pxref{String Constants}).
|
|
|
+
|
|
|
+@c ??? valid in identifiers?
|
|
|
+Another way to specify non-ASCII characters in constants (character or
|
|
|
+string) and identifiers is with an escape sequence starting with
|
|
|
+backslash, specifying the intended Unicode character. (@xref{Unicode
|
|
|
+Character Codes}.) This specifies non-ASCII characters without
|
|
|
+putting a real non-ASCII character in the source file itself.
|
|
|
+
|
|
|
+C accepts two-character aliases called @dfn{digraphs} for certain
|
|
|
+characters. @xref{Digraphs}.
|
|
|
+
|
|
|
+@node Whitespace
|
|
|
+@section Whitespace
|
|
|
+@cindex whitespace characters in source files
|
|
|
+@cindex space character in source
|
|
|
+@cindex tab character in source
|
|
|
+@cindex formfeed in source
|
|
|
+@cindex linefeed in source
|
|
|
+@cindex newline in source
|
|
|
+@cindex carriage return in source
|
|
|
+@cindex vertical tab in source
|
|
|
+
|
|
|
+Whitespace means characters that exist in a file but appear blank in a
|
|
|
+printed listing of a file (or traditionally did appear blank, several
|
|
|
+decades ago). The C language requires whitespace in order to separate
|
|
|
+two consecutive identifiers, or to separate an identifier from a
|
|
|
+numeric constant. Other than that, and a few special situations
|
|
|
+described later, whitespace is optional; you can put it in when you
|
|
|
+wish, to make the code easier to read.
|
|
|
+
|
|
|
+Space and tab in C code are treated as whitespace characters. So are
|
|
|
+line breaks. You can represent a line break with the newline
|
|
|
+character (also called @dfn{linefeed} or LF), CR (carriage return), or
|
|
|
+the CRLF sequence (two characters: carriage return followed by a
|
|
|
+newline character).
|
|
|
+
|
|
|
+The @dfn{formfeed} character, Control-L, was traditionally used to
|
|
|
+divide a file into pages. It is still used this way in source code,
|
|
|
+and the tools that generate nice printouts of source code still start
|
|
|
+a new page after each ``formfeed'' character. Dividing code into
|
|
|
+pages separated by formfeed characters is a good way to break it up
|
|
|
+into comprehensible pieces and show other programmers where they start
|
|
|
+and end.
|
|
|
+
|
|
|
+The @dfn{vertical tab} character, Control-K, was traditionally used to
|
|
|
+make printing advance down to the next section of a page. We know of
|
|
|
+no particular reason to use it in source code, but it is still
|
|
|
+accepted as whitespace in C.
|
|
|
+
|
|
|
+Comments are also syntactically equivalent to whitespace.
|
|
|
+@ifinfo
|
|
|
+@xref{Comments}.
|
|
|
+@end ifinfo
|
|
|
+
|
|
|
+@node Comments
|
|
|
+@section Comments
|
|
|
+@cindex comments
|
|
|
+
|
|
|
+A comment encapsulates text that has no effect on the program's
|
|
|
+execution or meaning.
|
|
|
+
|
|
|
+The purpose of comments is to explain the code to people who read it.
|
|
|
+Writing good comments for your code is tremendously important---they
|
|
|
+should provide background information that helps programmers
|
|
|
+understand the reasons why the code is written the way it is. You,
|
|
|
+returning to the code six months from now, will need the help of these
|
|
|
+comments to remember why you wrote it this way.
|
|
|
+
|
|
|
+Outdated comments that become incorrect are counterproductive, so part
|
|
|
+of the software developer's responsibility is to update comments as
|
|
|
+needed to correspond with changes to the program code.
|
|
|
+
|
|
|
+C allows two kinds of comment syntax, the traditional style and the
|
|
|
+C@t{++} style. A traditional C comment starts with @samp{/*} and ends
|
|
|
+with @samp{*/}. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{This is a comment in traditional C syntax.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+A traditional comment can contain @samp{/*}, but these delimiters do
|
|
|
+not nest as pairs. The first @samp{*/} ends the comment regardless of
|
|
|
+whether it contains @samp{/*} sequences.
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{This} /* @r{is a comment} */ But this is not! */
|
|
|
+@end example
|
|
|
+
|
|
|
+A @dfn{line comment} starts with @samp{//} and ends at the end of the line.
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+// @r{This is a comment in C@t{++} style.}
|
|
|
+@end example
|
|
|
+
|
|
|
+Line comments do nest, in effect, because @samp{//} inside a line
|
|
|
+comment is part of that comment:
|
|
|
+
|
|
|
+@example
|
|
|
+// @r{this whole line is} // @r{one comment}
|
|
|
+This is code, not comment.
|
|
|
+@end example
|
|
|
+
|
|
|
+It is safe to put line comments inside block comments, or vice versa.
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+/* @r{traditional comment}
|
|
|
+ // @r{contains line comment}
|
|
|
+ @r{more traditional comment}
|
|
|
+ */ text here is not a comment
|
|
|
+
|
|
|
+// @r{line comment} /* @r{contains traditional comment} */
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+But beware of commenting out one end of a traditional comment with a line
|
|
|
+comment. The delimiter @samp{/*} doesn't start a comment if it occurs
|
|
|
+inside an already-started comment.
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+ // @r{line comment} /* @r{That would ordinarily begin a block comment.}
|
|
|
+ Oops! The line comment has ended;
|
|
|
+ this isn't a comment any more. */
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+Comments are not recognized within string constants. @t{@w{"/* blah
|
|
|
+*/"}} is the string constant @samp{@w{/* blah */}}, not an empty
|
|
|
+string.
|
|
|
+
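As a minimal sketch of the rule that comments are not recognized within string constants, the function below measures such a string. The name @code{comment_like_length} is made up for this illustration.

```c
#include <string.h>

/* Return the length of a string constant whose text merely
   looks like a comment.  */
size_t
comment_like_length (void)
{
  /* Nothing between the quotes is stripped: the slash, star,
     space and letters are all ordinary string contents.  */
  return strlen ("/* blah */");
}
```

The function returns 10, the number of characters between the quotes, rather than 0.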
|
|
|
+In this manual we show the text in comments in a variable-width font,
|
|
|
+for readability, but this font distinction does not exist in source
|
|
|
+files.
|
|
|
+
|
|
|
+A comment is syntactically equivalent to whitespace, so it always
|
|
|
+separates tokens. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+ int/* @r{comment} */foo;
|
|
|
+@r{is equivalent to}
|
|
|
+ int foo;
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+but clean code always uses real whitespace to separate the comment
|
|
|
+visually from surrounding code.
|
|
|
+
|
|
|
+@node Identifiers
|
|
|
+@section Identifiers
|
|
|
+@cindex identifiers
|
|
|
+
|
|
|
+An @dfn{identifier} (name) in C is a sequence of letters and digits,
|
|
|
+as well as @samp{_}, that does not start with a digit. Most compilers
|
|
|
+also allow @samp{$}. An identifier can be as long as you like; for
|
|
|
+example,
|
|
|
+
|
|
|
+@example
|
|
|
+int anti_dis_establishment_arian_ism;
|
|
|
+@end example
|
|
|
+
|
|
|
+@cindex case of letters in identifiers
|
|
|
+Letters in identifiers are case-sensitive in C; thus, @code{a}
|
|
|
+and @code{A} are two different identifiers.
|
|
|
+
|
|
|
+@cindex keyword
|
|
|
+@cindex reserved words
|
|
|
+Identifiers in C are used as variable names, function names, typedef
|
|
|
+names, enumeration constants, type tags, field names, and labels.
|
|
|
+Certain identifiers in C are @dfn{keywords}, which means they have
|
|
|
+specific syntactic meanings. Keywords in C are @dfn{reserved words},
|
|
|
+meaning you cannot use them in any other way. For instance, you can't
|
|
|
+define a variable or function named @code{return} or @code{if}.
|
|
|
+
|
|
|
+You can also include other characters, even non-ASCII characters, in
|
|
|
+identifiers by writing their Unicode character names, which start with
|
|
|
+@samp{\u} or @samp{\U}, in the identifier name. @xref{Unicode
|
|
|
+Character Codes}. However, it is usually a bad idea to use non-ASCII
|
|
|
+characters in identifiers, and identifiers written in English
|
|
|
+never need non-ASCII characters. @xref{English}.
|
|
|
+
|
|
|
+Whitespace is required to separate two consecutive identifiers, or to
|
|
|
+separate an identifier from a preceding or following numeric
|
|
|
+constant.
|
|
|
+
|
|
|
+@node Operators/Punctuation
|
|
|
+@section Operators and Punctuation
|
|
|
+@cindex operators
|
|
|
+@cindex punctuation
|
|
|
+
|
|
|
+Here we describe the lexical syntax of operators and punctuation in C.
|
|
|
+The specific operators of C and their meanings are presented in
|
|
|
+subsequent chapters.
|
|
|
+
|
|
|
+Most operators in C consist of one or two characters that can't be
|
|
|
+used in identifiers. The characters used for operators in C are
|
|
|
+@samp{!~^&|*/%+-=<>,.?:}.
|
|
|
+
|
|
|
+Some operators are a single character. For instance, @samp{-} is the
|
|
|
+operator for negation (with one operand) and the operator for
|
|
|
+subtraction (with two operands).
|
|
|
+
|
|
|
+Some operators are two characters. For example, @samp{++} is the
|
|
|
+increment operator. Recognition of multicharacter operators works by
|
|
|
+grouping together as many consecutive characters as can constitute one
|
|
|
+operator.
|
|
|
+
|
|
|
+For instance, the character sequence @samp{++} is always interpreted
|
|
|
+as the increment operator; therefore, if we want to write two
|
|
|
+consecutive instances of the operator @samp{+}, we must separate them
|
|
|
+with a space so that they do not combine as one token. Applying the
|
|
|
+same rule, @code{a+++++b} is always tokenized as @code{@w{a++ ++ +
|
|
|
+b}}, not as @code{@w{a++ + ++b}}, even though the latter could be part
|
|
|
+of a valid C program and the former could not (since @code{a++}
|
|
|
+is not an lvalue and thus can't be the operand of @code{++}).
|
|
|
+
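The same grouping rule can be seen in a shorter case that is valid C. In this sketch (the function name @code{munch_demo} is hypothetical), @code{a+++b} is tokenized as @code{a ++ + b}, never as @code{a + ++b}.

```c
/* Evaluate a+++b with a = 1 and b = 2.  Store the final value
   of a in *a_after and return the value of the expression.  */
int
munch_demo (int *a_after)
{
  int a = 1, b = 2;
  int sum = a+++b;      /* Grouped as (a++) + b.  */

  *a_after = a;
  return sum;
}
```

The expression yields 3 and leaves @code{a} equal to 2. Had it been grouped as @code{a + ++b}, the result would be 4 and @code{a} would stay 1.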
|
|
|
+A few C operators are keywords rather than special characters. They
|
|
|
+include @code{sizeof} (@pxref{Type Size}) and @code{_Alignof}
|
|
|
+(@pxref{Type Alignment}).
|
|
|
+
|
|
|
+The characters @samp{;@{@}[]()} are used for punctuation and grouping.
|
|
|
+Semicolon (@samp{;}) ends a statement. Braces (@samp{@{} and
|
|
|
+@samp{@}}) begin and end a block at the statement level
|
|
|
+(@pxref{Blocks}), and surround the initializer (@pxref{Initializers})
|
|
|
+for a variable with multiple elements or components (such as arrays or
|
|
|
+structures).
|
|
|
+
|
|
|
+Square brackets (@samp{[} and @samp{]}) do array indexing, as in
|
|
|
+@code{array[5]}.
|
|
|
+
|
|
|
+Parentheses are used in expressions for explicit nesting of
|
|
|
+expressions (@pxref{Basic Arithmetic}), around the parameter
|
|
|
+declarations in a function declaration or definition, and around the
|
|
|
+arguments in a function call, as in @code{printf ("Foo %d\n", i)}
|
|
|
+(@pxref{Function Calls}). Several kinds of statements also use
|
|
|
+parentheses as part of their syntax---for instance, @code{if}
|
|
|
+statements, @code{for} statements, @code{while} statements, and
|
|
|
+@code{switch} statements. @xref{if Statement}, and following
|
|
|
+sections.
|
|
|
+
|
|
|
+Parentheses are also required around the operand of the operator
|
|
|
+keywords @code{sizeof} and @code{_Alignof} when the operand is a data
|
|
|
+type rather than a value. @xref{Type Size}.
|
|
|
+
|
|
|
+@node Line Continuation
|
|
|
+@section Line Continuation
|
|
|
+@cindex line continuation
|
|
|
+@cindex continuation of lines
|
|
|
+
|
|
|
+The sequence of a backslash and a newline is ignored absolutely
|
|
|
+anywhere in a C program. This makes it possible to split a single
|
|
|
+source line into multiple lines in the source file. GNU C tolerates
|
|
|
+and ignores other whitespace between the backslash and the newline.
|
|
|
+In particular, it always ignores a CR (carriage return) character
|
|
|
+there, in case some text editor decided to end the line with the CRLF
|
|
|
+sequence.
|
|
|
+
|
|
|
+The main use of line continuation in C is for macro definitions that
|
|
|
+would be inconveniently long for a single line (@pxref{Macros}).
|
|
|
+
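Here is a sketch of that typical use. The macro @code{SWAP_INTS} and the function @code{swap_demo} are invented for this illustration; they are not part of any library.

```c
/* SWAP_INTS is a hypothetical macro used only for illustration.
   Every line of the definition except the last ends with a
   backslash, so the preprocessor sees one long line.  */
#define SWAP_INTS(x, y)     \
  do                        \
    {                       \
      int swap_tmp = (x);   \
      (x) = (y);            \
      (y) = swap_tmp;       \
    }                       \
  while (0)

/* Swap 1 and 2, then pack both results into one number.  */
int
swap_demo (void)
{
  int a = 1, b = 2;

  SWAP_INTS (a, b);
  return a * 10 + b;
}
```

After the swap, @code{a} is 2 and @code{b} is 1, so @code{swap_demo} returns 21.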
|
|
|
+It is possible to continue a line comment onto another line with
|
|
|
+backslash-newline. You can put backslash-newline in the middle of an
|
|
|
+identifier, even a keyword, or an operator. You can even split
|
|
|
+@samp{/*}, @samp{*/}, and @samp{//} onto multiple lines with
|
|
|
+backslash-newline. Here's an ugly example:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+/\
|
|
|
+*
|
|
|
+*/ fo\
|
|
|
+o +\
|
|
|
+= 1\
|
|
|
+0;
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+That's equivalent to @samp{/* */ foo += 10;}.
|
|
|
+
|
|
|
+Don't do those things in real programs, since they make code hard to
|
|
|
+read.
|
|
|
+
|
|
|
+@strong{Note:} For the sake of using certain tools on the source code, it is
|
|
|
+wise to end every source file with a newline character which is not
|
|
|
+preceded by a backslash, so that it really ends the last line.
|
|
|
+
|
|
|
+@node Arithmetic
|
|
|
+@chapter Arithmetic
|
|
|
+@cindex arithmetic operators
|
|
|
+@cindex operators, arithmetic
|
|
|
+
|
|
|
+@c ??? Duplication with other sections -- get rid of that?
|
|
|
+
|
|
|
+Arithmetic operators in C attempt to be as similar as possible to the
|
|
|
+abstract arithmetic operations, but it is impossible to do this
|
|
|
+perfectly. Numbers in a computer have a finite range of possible
|
|
|
+values, and non-integer values have a limit on their possible
|
|
|
+accuracy. Nonetheless, in most cases you will encounter no surprises
|
|
|
+in using @samp{+} for addition, @samp{-} for subtraction, and @samp{*}
|
|
|
+for multiplication.
|
|
|
+
|
|
|
+Each C operator has a @dfn{precedence}, which is its rank in the
|
|
|
+grammatical order of the various operators. The operators with the
|
|
|
+highest precedence grab adjoining operands first; these expressions
|
|
|
+then become operands for operators of lower precedence. We give some
|
|
|
+information about precedence of operators in this chapter where we
|
|
|
+describe the operators; for the full explanation, see @ref{Binary
|
|
|
+Operator Grammar}.
|
|
|
+
|
|
|
+The arithmetic operators always @dfn{promote} their operands before
|
|
|
+operating on them. This means converting narrow integer data types to
|
|
|
+a wider data type (@pxref{Operand Promotions}). If you are just
|
|
|
+learning C, don't worry about this yet.
|
|
|
+
|
|
|
+Given two operands that have different types, most arithmetic
|
|
|
+operations convert them both to their @dfn{common type}. For
|
|
|
+instance, if one is @code{int} and the other is @code{double}, the
|
|
|
+common type is @code{double}. (That's because @code{double} can
|
|
|
+represent all the values that an @code{int} can hold, but not vice
|
|
|
+versa.) For the full details, see @ref{Common Type}.
|
|
|
+
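A small sketch of the common-type rule, using two made-up function names: dividing an @code{int} by a @code{double} is done in @code{double}, while dividing an @code{int} by an @code{int} stays in @code{int}.

```c
/* The operands are int and double, so the common type is
   double and the division is done in floating point.  */
double
half_mixed (void)
{
  int one = 1;

  return one / 2.0;
}

/* Both operands are int, so the division is integer division.  */
int
half_integer (void)
{
  int one = 1;

  return one / 2;
}
```

The first function returns 0.5 and the second returns 0.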
|
|
|
+@menu
|
|
|
+* Basic Arithmetic:: Addition, subtraction, multiplication,
|
|
|
+ and division.
|
|
|
+* Integer Arithmetic:: How C performs arithmetic with integer values.
|
|
|
+* Integer Overflow:: When an integer value exceeds the range
|
|
|
+ of its type.
|
|
|
+* Mixed Mode:: Calculating with both integer values
|
|
|
+ and floating-point values.
|
|
|
+* Division and Remainder:: How integer division works.
|
|
|
+* Numeric Comparisons:: Comparing numeric values for equality or order.
|
|
|
+* Shift Operations:: Shift integer bits left or right.
|
|
|
+* Bitwise Operations:: Bitwise conjunction, disjunction, negation.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Basic Arithmetic
|
|
|
+@section Basic Arithmetic
|
|
|
+@cindex addition operator
|
|
|
+@cindex subtraction operator
|
|
|
+@cindex multiplication operator
|
|
|
+@cindex division operator
|
|
|
+@cindex negation operator
|
|
|
+@cindex operator, addition
|
|
|
+@cindex operator, subtraction
|
|
|
+@cindex operator, multiplication
|
|
|
+@cindex operator, division
|
|
|
+@cindex operator, negation
|
|
|
+
|
|
|
+Basic arithmetic in C is done with the usual binary operators of
|
|
|
+algebra: addition (@samp{+}), subtraction (@samp{-}), multiplication
|
|
|
+(@samp{*}) and division (@samp{/}). The unary operator @samp{-} is
|
|
|
+used to change the sign of a number. The unary @code{+} operator also
|
|
|
+exists; it yields its operand unaltered.
|
|
|
+
|
|
|
+@samp{/} is the division operator, but dividing integers may not give
|
|
|
+the result you expect. Its value is an integer, which is not equal to
|
|
|
+the mathematical quotient when that is a fraction. Use @samp{%} to
|
|
|
+get the corresponding integer remainder when necessary.
|
|
|
+@xref{Division and Remainder}. Floating-point division yields a value
|
|
|
+as close as possible to the mathematical quotient.
|
|
|
+
|
|
|
+These operators use algebraic syntax with the usual algebraic
|
|
|
+precedence rule (@pxref{Binary Operator Grammar}) that multiplication
|
|
|
+and division are done before addition and subtraction, but you can use
|
|
|
+parentheses to explicitly specify how the operators nest. They are
|
|
|
+left-associative (@pxref{Associativity and Ordering}). Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+-a + b - c + d * e / f
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is equivalent to
|
|
|
+
|
|
|
+@example
|
|
|
+(((-a) + b) - c) + ((d * e) / f)
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Integer Arithmetic
|
|
|
+@section Integer Arithmetic
|
|
|
+@cindex integer arithmetic
|
|
|
+
|
|
|
+Each of the basic arithmetic operations in C has two variants for
|
|
|
+integers: @dfn{signed} and @dfn{unsigned}. The choice is determined
|
|
|
+by the data types of their operands.
|
|
|
+
|
|
|
+Each integer data type in C is either @dfn{signed} or @dfn{unsigned}.
|
|
|
+A signed type can hold a range of positive and negative numbers, with
|
|
|
+zero near the middle of the range. An unsigned type can hold only
|
|
|
+nonnegative numbers; its range starts with zero and runs upward.
|
|
|
+
|
|
|
+The most basic integer types are @code{int}, which normally can hold
|
|
|
+numbers from @minus{}2,147,483,648 to 2,147,483,647, and @code{unsigned
|
|
|
+int}, which normally can hold numbers from 0 to 4,294,967,295. (This
|
|
|
+assumes @code{int} is 32 bits wide, always true for GNU C on real
|
|
|
+computers but not always on embedded controllers.) @xref{Integer
|
|
|
+Types}, for full information about integer types.
|
|
|
+
|
|
|
+When a basic arithmetic operation is given two signed operands, it
|
|
|
+does signed arithmetic. Given two unsigned operands, it does
|
|
|
+unsigned arithmetic.
|
|
|
+
|
|
|
+If one operand is @code{unsigned int} and the other is @code{int}, the
|
|
|
+operator treats them both as unsigned. More generally, the common
|
|
|
+type of the operands determines whether the operation is signed or
|
|
|
+not. @xref{Common Type}.
|
|
|
+
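The following sketch shows one consequence of that rule. The function names are hypothetical. Comparing @minus{}1 with 1 gives different answers depending on whether the common type is signed or unsigned.

```c
/* Both operands are int: a signed comparison, so -1 < 1.  */
int
signed_less (void)
{
  int i = -1;
  int j = 1;

  return i < j;
}

/* One operand is unsigned int, so i is converted to unsigned
   int first.  It becomes the largest unsigned value, which is
   not less than 1.  */
int
mixed_less (void)
{
  int i = -1;
  unsigned int u = 1;

  return i < u;
}
```

The first function returns 1 and the second returns 0. A compiler may warn about the mixed comparison, which is a useful hint that the result may not be what was intended.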
|
|
|
+Printing the results of unsigned arithmetic with @code{printf} using
|
|
|
+@samp{%d} can produce surprising results for values far away from
|
|
|
+zero. Even though the rules above say that the computation was done
|
|
|
+with unsigned arithmetic, the printed result may appear to be signed!
|
|
|
+
|
|
|
+The explanation is that the bit pattern resulting from addition,
|
|
|
+subtraction or multiplication is actually the same for signed and
|
|
|
+unsigned operations. The difference is only in the data type of the
|
|
|
+result, which affects the @emph{interpretation} of the result bit pattern,
|
|
|
+and whether the arithmetic operation can overflow (see the next section).
|
|
|
+
|
|
|
+But @samp{%d} doesn't know its argument's data type. It sees only the
|
|
|
+value's bit pattern, and it is defined to interpret that as
|
|
|
+@code{signed int}. To print it as unsigned requires using @samp{%u}
|
|
|
+instead of @samp{%d}. @xref{Formatted Output, The GNU C Library, ,
|
|
|
+libc, The GNU C Library Reference Manual}.
|
|
|
+
|
|
|
+Arithmetic in C never operates directly on narrow integer types (those
|
|
|
+with fewer bits than @code{int}; @ref{Narrow Integers}). Instead it
|
|
|
+``promotes'' them to @code{int}. @xref{Operand Promotions}.
|
|
|
+
|
|
|
+@node Integer Overflow
|
|
|
+@section Integer Overflow
|
|
|
+@cindex integer overflow
|
|
|
+@cindex overflow, integer
|
|
|
+
|
|
|
+When the mathematical value of an arithmetic operation doesn't fit in
|
|
|
+the range of the data type in use, that's called @dfn{overflow}.
|
|
|
+When it happens in integer arithmetic, it is @dfn{integer overflow}.
|
|
|
+
|
|
|
+Integer overflow happens only in arithmetic operations. Type conversion
|
|
|
+operations, by definition, do not cause overflow, not even when the
|
|
|
+result can't fit in its new type. @xref{Integer Conversion}.
|
|
|
+
|
|
|
+Signed numbers use two's-complement representation, in which the most
|
|
|
+negative number lacks a positive counterpart (@pxref{Integers in
|
|
|
+Depth}). Thus, the unary @samp{-} operator on a signed integer can
|
|
|
+overflow.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Unsigned Overflow:: Overflow in unsigned integer arithmetic.
|
|
|
+* Signed Overflow:: Overflow in signed integer arithmetic.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Unsigned Overflow
|
|
|
+@subsection Overflow with Unsigned Integers
|
|
|
+
|
|
|
+Unsigned arithmetic in C ignores overflow; it produces the true result
|
|
|
+modulo the @var{n}th power of 2, where @var{n} is the number of bits
|
|
|
+in the data type. We say it ``truncates'' the true result to the
|
|
|
+lowest @var{n} bits.
|
|
|
+
|
|
|
+A true result that is negative, when taken modulo the @var{n}th power
|
|
|
+of 2, yields a positive number. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned int x = 1;
|
|
|
+unsigned int y;
|
|
|
+
|
|
|
+y = -x;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+causes overflow because the negative number @minus{}1 can't be stored
|
|
|
+in an unsigned type. The actual result, which is @minus{}1 modulo the
|
|
|
+@var{n}th power of 2, is one less than the @var{n}th power of 2. That
|
|
|
+is the largest value that the unsigned data type can store. For a
|
|
|
+32-bit @code{unsigned int}, the value is 4,294,967,295. @xref{Maximum
|
|
|
+and Minimum Values}.
|
|
|
+
|
|
|
+Adding that number to itself, as here,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned int z;
|
|
|
+
|
|
|
+z = y + y;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+ought to yield 8,589,934,590; however, that is again too large to fit,
|
|
|
+so overflow truncates the value to 4,294,967,294. If that were a
|
|
|
+signed integer, it would mean @minus{}2, which (not by coincidence)
|
|
|
+equals @minus{}1 + @minus{}1.
|
|
|
+
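The two computations above can be written as functions and checked against @code{UINT_MAX} from @file{limits.h}, so the check does not depend on the width of @code{unsigned int}. This is only a sketch, and the function names are made up.

```c
#include <limits.h>

/* Negate 1 in unsigned arithmetic.  The true result, -1, wraps
   around to the largest unsigned value.  */
unsigned int
negate_one (void)
{
  unsigned int x = 1;

  return -x;
}

/* Add the largest unsigned value to itself.  The true result is
   truncated to the low-order bits.  */
unsigned int
double_largest (void)
{
  unsigned int y = UINT_MAX;

  return y + y;
}
```

Here @code{negate_one} returns @code{UINT_MAX}, and @code{double_largest} returns @code{UINT_MAX - 1}.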
|
|
|
+@node Signed Overflow
|
|
|
+@subsection Overflow with Signed Integers
|
|
|
+@cindex compiler options for integer overflow
|
|
|
+@cindex integer overflow, compiler options
|
|
|
+@cindex overflow, compiler options
|
|
|
+
|
|
|
+For signed integers, the result of overflow in C is @emph{in
|
|
|
+principle} undefined, meaning that anything whatsoever could happen.
|
|
|
+Therefore, C compilers can do optimizations that treat the overflow
|
|
|
+case with total unconcern. (Since the result of overflow is undefined
|
|
|
+in principle, one cannot claim that these optimizations are
|
|
|
+erroneous.)
|
|
|
+
|
|
|
+@strong{Watch out:} These optimizations can do surprising things. For
|
|
|
+instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+@r{@dots{}}
|
|
|
+if (i < i + 1)
|
|
|
+ x = 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+could be optimized to do the assignment unconditionally, because the
|
|
|
+@code{if}-condition is always true if @code{i + 1} does not overflow.
|
|
|
+
|
|
|
+GCC offers compiler options to control handling signed integer
|
|
|
+overflow. These options operate per module; that is, each module
|
|
|
+behaves according to the options it was compiled with.
|
|
|
+
|
|
|
+These two options specify particular ways to handle signed integer
|
|
|
+overflow, other than the default way:
|
|
|
+
|
|
|
+@table @option
|
|
|
+@item -fwrapv
|
|
|
+Make signed integer operations well-defined, like unsigned integer
|
|
|
+operations: they produce the @var{n} low-order bits of the true
|
|
|
+result. The highest of those @var{n} bits is the sign bit of the
|
|
|
+result. With @option{-fwrapv}, these out-of-range operations are not
|
|
|
+considered overflow, so (strictly speaking) integer overflow never
|
|
|
+happens.
|
|
|
+
|
|
|
+The option @option{-fwrapv} enables some optimizations based on the
|
|
|
+defined values of out-of-range results. In GCC 8, it disables
|
|
|
+optimizations that are based on assuming signed integer operations
|
|
|
+will not overflow.
|
|
|
+
|
|
|
+@item -ftrapv
|
|
|
+Generate a signal @code{SIGFPE} when signed integer overflow occurs.
|
|
|
+This terminates the program unless the program handles the signal.
|
|
|
+@xref{Signals}.
|
|
|
+@end table
|
|
|
+
|
|
|
+One other option is useful for finding where overflow occurs:
|
|
|
+
|
|
|
+@ignore
|
|
|
+@item -fno-strict-overflow
|
|
|
+Disable optimizations that are based on assuming signed integer
|
|
|
+operations will not overflow.
|
|
|
+@end ignore
|
|
|
+
|
|
|
+@table @option
|
|
|
+@item -fsanitize=signed-integer-overflow
|
|
|
+Output a warning message at run time when signed integer overflow
|
|
|
+occurs. This checks the @samp{+}, @samp{*}, and @samp{-} operators.
|
|
|
+This takes priority over @option{-ftrapv}.
|
|
|
+@end table
|
|
|
+
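Independently of these options, a program can avoid signed overflow by testing the operands before it adds them. The following is a minimal sketch of that approach; the function @code{checked_add} is hypothetical. GCC also provides built-in functions such as @code{__builtin_add_overflow} for the same purpose.

```c
#include <limits.h>
#include <stdbool.h>

/* If a + b fits in an int, store it in *sum and return true.
   Otherwise leave *sum alone and return false.  The tests
   themselves never overflow.  */
bool
checked_add (int a, int b, int *sum)
{
  if ((b > 0 && a > INT_MAX - b)
      || (b < 0 && a < INT_MIN - b))
    return false;

  *sum = a + b;
  return true;
}
```

The subtraction @code{INT_MAX - b} is done only when @code{b} is positive, and @code{INT_MIN - b} only when @code{b} is negative, so neither can go out of range.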
|
|
|
+@node Mixed Mode
|
|
|
+@section Mixed-Mode Arithmetic
|
|
|
+
|
|
|
+Mixing integers and floating-point numbers in a basic arithmetic
|
|
|
+operation converts the integers automatically to floating point.
|
|
|
+In most cases, this gives exactly the desired results.
|
|
|
+But sometimes it matters precisely where the conversion occurs.
|
|
|
+
|
|
|
+If @code{i} and @code{j} are integers, @code{(i + j) * 2.0} adds them
|
|
|
+as integers, then converts the sum to floating point for the
|
|
|
+multiplication. If the addition gets an overflow, that is not
|
|
|
+equivalent to converting both integers to floating point and then
|
|
|
+adding them. You can get the latter result by explicitly converting
|
|
|
+the integers, as in @code{((double) i + (double) j) * 2.0}.
|
|
|
+@xref{Explicit Type Conversion}.
|
|
|
+
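As a sketch of the explicit-conversion form, the hypothetical function below converts each operand before adding, so the sum is computed in floating point and cannot overflow @code{int}.

```c
/* Return twice the sum of i and j.  Each operand is converted
   to double before the addition, so the addition is done in
   floating point.  */
double
doubled_sum (int i, int j)
{
  return ((double) i + (double) j) * 2.0;
}
```

Assuming a 32-bit @code{int}, @code{doubled_sum (2147483647, 1)} returns 4294967296.0, whereas the integer addition in @code{(i + j) * 2.0} would overflow for the same arguments.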
|
|
|
+@c Eggert's report
|
|
|
+Adding or multiplying several values, including some integers and some
|
|
|
+floating point, does the operations left to right. Thus, @code{3.0 +
|
|
|
+i + j} converts @code{i} to floating point, then adds 3.0, then
|
|
|
+converts @code{j} to floating point and adds that. You can specify a
|
|
|
+different order using parentheses: @code{3.0 + (i + j)} adds @code{i}
|
|
|
+and @code{j} first and then adds that result (converting to floating
|
|
|
+point) to 3.0. In this respect, C differs from other languages, such
|
|
|
+as Fortran.
|
|
|
+
|
|
|
+@node Division and Remainder
|
|
|
+@section Division and Remainder
|
|
|
+@cindex remainder operator
|
|
|
+@cindex modulus
|
|
|
+@cindex operator, remainder
|
|
|
+
|
|
|
+Division of integers in C rounds the result to an integer. The result
|
|
|
+is always rounded towards zero.
|
|
|
+
|
|
|
+@example
|
|
|
+ 16 / 3 @result{} 5
|
|
|
+-16 / 3 @result{} -5
|
|
|
+ 16 / -3 @result{} -5
|
|
|
+-16 / -3 @result{} 5
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+To get the corresponding remainder, use the @samp{%} operator:
|
|
|
+
|
|
|
+@example
|
|
|
+ 16 % 3 @result{} 1
|
|
|
+-16 % 3 @result{} -1
|
|
|
+ 16 % -3 @result{} 1
|
|
|
+-16 % -3 @result{} -1
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+@samp{%} has the same operator precedence as @samp{/} and @samp{*}.
|
|
|
+
|
|
|
+From the rounded quotient and the remainder, you can reconstruct
|
|
|
+the dividend, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+original_dividend (int divisor, int quotient, int remainder)
|
|
|
+@{
|
|
|
+ return divisor * quotient + remainder;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+To do unrounded division, use floating point. If only one operand is
|
|
|
+floating point, @samp{/} converts the other operand to floating
|
|
|
+point.
|
|
|
+
|
|
|
+@example
|
|
|
+16.0 / 3 @result{} 5.333333333333333
|
|
|
+16 / 3.0 @result{} 5.333333333333333
|
|
|
+16.0 / 3.0 @result{} 5.333333333333333
|
|
|
+16 / 3 @result{} 5
|
|
|
+@end example
|
|
|
+
|
|
|
+The remainder operator @samp{%} is not allowed for floating-point
|
|
|
+operands, because it is not needed. The concept of remainder makes
|
|
|
+sense for integers because the result of division of integers has to
|
|
|
+be an integer. For floating point, the result of division is a
|
|
|
+floating-point number, in other words a fraction, which will differ
|
|
|
+from the exact result only by a very small amount.
|
|
|
+
|
|
|
+There are functions in the standard C library to calculate remainders
|
|
|
+from integral-valued division of floating-point numbers.
|
|
|
+@xref{Remainder Functions, The GNU C Library, , libc, The GNU C Library
|
|
|
+Reference Manual}.
|
|
|
+
|
|
|
+Integer division overflows in one specific case: dividing the smallest
|
|
|
+negative value for the data type (@pxref{Maximum and Minimum Values})
|
|
|
+by @minus{}1. That's because the correct result, which is the
|
|
|
+corresponding positive number, does not fit (@pxref{Integer Overflow})
|
|
|
+in the same number of bits. On some computers now in use, this always
|
|
|
+causes a signal @code{SIGFPE} (@pxref{Signals}), the same behavior
|
|
|
+that the option @option{-ftrapv} specifies (@pxref{Signed Overflow}).
|
|
|
+
|
|
|
+Division by zero leads to unpredictable results---depending on the
|
|
|
+type of computer, it might cause a signal @code{SIGFPE}, or it might
|
|
|
+produce a numeric result.
|
|
|
+
|
|
|
+@cindex division by zero
|
|
|
+@cindex zero, division by
|
|
|
+@strong{Watch out:} Make sure the program does not divide by zero. If
|
|
|
+you can't prove that the divisor is not zero, test whether it is zero,
|
|
|
+and skip the division if so.
|
|
|
+
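One way to follow that advice is to wrap the division in a function that tests the divisor first. This is only a sketch, and the name @code{safe_divide} is made up. It also rejects the one case in which integer division overflows.

```c
#include <limits.h>
#include <stdbool.h>

/* If dividend / divisor is well defined, store it in *quotient
   and return true.  Otherwise skip the division and return
   false.  */
bool
safe_divide (int dividend, int divisor, int *quotient)
{
  if (divisor == 0)
    return false;               /* Division by zero.  */
  if (dividend == INT_MIN && divisor == -1)
    return false;               /* The result would not fit.  */

  *quotient = dividend / divisor;
  return true;
}
```

For instance, @code{safe_divide (-16, 3, &q)} stores @minus{}5 in @code{q} and returns true, while a zero divisor makes the function return false without dividing.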
|
|
|
+@node Numeric Comparisons
|
|
|
+@section Numeric Comparisons
|
|
|
+@cindex numeric comparisons
|
|
|
+@cindex comparisons
|
|
|
+@cindex operators, comparison
|
|
|
+@cindex equal operator
|
|
|
+@cindex not-equal operator
|
|
|
+@cindex less-than operator
|
|
|
+@cindex greater-than operator
|
|
|
+@cindex less-or-equal operator
|
|
|
+@cindex greater-or-equal operator
|
|
|
+@cindex operator, equal
|
|
|
+@cindex operator, not-equal
|
|
|
+@cindex operator, less-than
|
|
|
+@cindex operator, greater-than
|
|
|
+@cindex operator, less-or-equal
|
|
|
+@cindex operator, greater-or-equal
|
|
|
+@cindex truth value
|
|
|
+
|
|
|
+There are two kinds of comparison operators: @dfn{equality} and
|
|
|
+@dfn{ordering}. Equality comparisons test whether two expressions
|
|
|
+have the same value. The result is a @dfn{truth value}: a number that
|
|
|
+is 1 for ``true'' and 0 for ``false.''
|
|
|
+
|
|
|
+@example
|
|
|
+a == b /* @r{Test for equal.} */
|
|
|
+a != b /* @r{Test for not equal.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+The equality comparison is written @code{==} because plain @code{=}
|
|
|
+is the assignment operator.
|
|
|
+
|
|
|
+Ordering comparisons test which operand is greater or less. Their
|
|
|
+results are truth values. These are the ordering comparisons of C:
|
|
|
+
|
|
|
+@example
|
|
|
+a < b /* @r{Test for less-than.} */
|
|
|
+a > b /* @r{Test for greater-than.} */
|
|
|
+a <= b /* @r{Test for less-than-or-equal.} */
|
|
|
+a >= b /* @r{Test for greater-than-or-equal.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+For any integers @code{a} and @code{b}, exactly one of the comparisons
|
|
|
+@code{a < b}, @code{a == b} and @code{a > b} is true, just as in
|
|
|
+mathematics. However, if @code{a} and @code{b} are special floating
|
|
|
+point values (not ordinary numbers), all three can be false.
|
|
|
+@xref{Special Float Values}, and @ref{Invalid Optimizations}.
|
|
|
+
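The following sketch counts how many of the three comparisons are true. The function name is hypothetical; @code{NAN} is the not-a-number macro from @file{math.h}, used here as an example of a special floating-point value.

```c
#include <math.h>

/* Return how many of a < b, a == b and a > b are true.  Each
   comparison yields a truth value, 1 or 0, so they can simply
   be added.  */
int
count_true_comparisons (double a, double b)
{
  return (a < b) + (a == b) + (a > b);
}
```

For ordinary numbers the count is always 1. When either argument is @code{NAN}, the count is 0, assuming the compiler is not told to ignore such values.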
|
|
|
+@node Shift Operations
|
|
|
+@section Shift Operations
|
|
|
+@cindex shift operators
|
|
|
+@cindex operators, shift
|
|
|
|
|
|
+@cindex shift count
|
|
|
+
|
|
|
+@dfn{Shifting} an integer means moving the bit values to the left or
|
|
|
+right within the bits of the data type. Shifting is defined only for
|
|
|
+integers. Here's the way to write it:
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{Left shift.} */
|
|
|
+5 << 2 @result{} 20
|
|
|
+
|
|
|
+/* @r{Right shift.} */
|
|
|
+5 >> 2 @result{} 1
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The left operand is the value to be shifted, and the right operand
|
|
|
+says how many bits to shift it (the @dfn{shift count}). The left
|
|
|
+operand is promoted (@pxref{Operand Promotions}), so shifting never
|
|
|
+operates on a narrow integer type; it's always either @code{int} or
|
|
|
+wider. The value of the shift operator has the same type as the
|
|
|
+promoted left operand.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Bits Shifted In:: How shifting makes new bits to shift in.
|
|
|
+* Shift Caveats:: Caveats of shift operations.
|
|
|
+* Shift Hacks:: Clever tricks with shift operations.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Bits Shifted In
|
|
|
+@subsection Shifting Makes New Bits
|
|
|
+
|
|
|
+A shift operation shifts towards one end of the number and has to
|
|
|
+generate new bits at the other end.
|
|
|
+
|
|
|
+Shifting left one bit must generate a new least significant bit. It
|
|
|
+always brings in zero there. It is equivalent to multiplying by the
|
|
|
+appropriate power of 2. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+5 << 3 @r{is equivalent to} 5 * 2*2*2
|
|
|
+-10 << 4 @r{is equivalent to} -10 * 2*2*2*2
|
|
|
+@end example
|
|
|
+
|
|
|
+The meaning of shifting right depends on whether the data type is
|
|
|
+signed or unsigned (@pxref{Signed and Unsigned Types}). For a signed
|
|
|
+data type, it performs ``arithmetic shift,'' which keeps the number's
|
|
|
+sign unchanged by duplicating the sign bit. For an unsigned data
|
|
|
+type, it performs ``logical shift,'' which always shifts in zeros at
|
|
|
+the most significant bit.
|
|
|
+
|
|
|
+In both cases, shifting right one bit is division by two, rounding
|
|
|
+towards negative infinity. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+(unsigned) 19 >> 2 @result{} 4
|
|
|
+(unsigned) 20 >> 2 @result{} 5
|
|
|
+(unsigned) 21 >> 2 @result{} 5
|
|
|
+@end example
|
|
|
+
|
|
|
+For a negative left operand @code{a}, @code{a >> 1} is not equivalent to
|
|
|
+@code{a / 2}. They both divide by 2, but @samp{/} rounds toward
|
|
|
+zero.
|
|
|
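The difference can be seen with a small sketch.  The two function names
are hypothetical, and the shift version relies on the arithmetic right
shift that GNU C performs on signed values.

```c
/* Halve by shifting.  For negative a this relies on GNU C's
   arithmetic right shift, which rounds toward negative infinity.  */
int
halve_by_shift (int a)
{
  return a >> 1;
}

/* Halve by dividing; the quotient is rounded toward zero.  */
int
halve_by_division (int a)
{
  return a / 2;
}
```

With the argument @minus{}7, the first function gives @minus{}4 and the
second gives @minus{}3.  For even or nonnegative arguments they agree.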
+
|
|
|
+The shift count must be zero or greater. Shifting by a negative
|
|
|
+number of bits gives machine-dependent results.
|
|
|
+
|
|
|
+@node Shift Caveats
|
|
|
+@subsection Caveats for Shift Operations
|
|
|
+
|
|
|
+@strong{Warning:} If the shift count is greater than or equal to the
|
|
|
+width in bits of the first operand, the results are machine-dependent.
|
|
|
+Logically speaking, the ``correct'' value would be either -1 (for
|
|
|
+right shift of a negative number) or 0 (in all other cases), but what
|
|
|
+it really generates is whatever the machine's shift instruction does in
|
|
|
+that case. So unless you can prove that the second operand is not too
|
|
|
+large, write code to check it at run time.
|
|
|
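One way to write such a run-time check is sketched below.  The function
names are hypothetical, and the width computation assumes the integer
types have no padding bits, which holds on common machines.

```c
#include <limits.h>   /* Declares CHAR_BIT. */

/* Left shift that yields 0 when the count is too large.  */
unsigned int
checked_shift_left (unsigned int value, unsigned int count)
{
  unsigned int width = sizeof (unsigned int) * CHAR_BIT;
  return count < width ? value << count : 0u;
}

/* Right shift that yields -1 (negative value) or 0 (otherwise)
   when the count is too large.  */
int
checked_shift_right (int value, unsigned int count)
{
  unsigned int width = sizeof (int) * CHAR_BIT;
  if (count >= width)
    return value < 0 ? -1 : 0;
  return value >> count;
}
```

These return the ``correct'' values described above instead of whatever
the machine's shift instruction would produce.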
+
|
|
|
+@strong{Warning:} Never rely on how the shift operators relate in
|
|
|
+precedence to other arithmetic binary operators. Programmers don't
|
|
|
+remember these precedences, and won't understand the code. Always use
|
|
|
+parentheses to explicitly specify the nesting, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+a + (b << 5) /* @r{Shift first, then add.} */
|
|
|
+(a + b) << 5 /* @r{Add first, then shift.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+Note: according to the C standard, shifting of signed values isn't
|
|
|
+guaranteed to work properly when the value shifted is negative, or
|
|
|
+becomes negative during the operation of shifting left. However, only
|
|
|
+pedants have a reason to be concerned about this; only computers with
|
|
|
+strange shift instructions could plausibly do this wrong. In GNU C,
|
|
|
+the operation always works as expected.
|
|
|
+
|
|
|
+@node Shift Hacks
|
|
|
+@subsection Shift Hacks
|
|
|
+
|
|
|
+You can use the shift operators for various useful hacks. For
|
|
|
+example, given a date specified by day of the month @code{d}, month
|
|
|
+@code{m}, and year @code{y}, you can store the entire date in a single
|
|
|
+integer @code{date}:
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned int d = 12;
|
|
|
+unsigned int m = 6;
|
|
|
+unsigned int y = 1983;
|
|
|
+unsigned int date = (((y << 4) + m) << 5) + d;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+To extract the original day, month, and year out of
|
|
|
+@code{date}, use a combination of shift and remainder.
|
|
|
+
|
|
|
+@example
|
|
|
+d = date % 32;
|
|
|
+m = (date >> 5) % 16;
|
|
|
+y = date >> 9;
|
|
|
+@end example
|
|
|
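The packing and unpacking steps above can be wrapped in functions, as in
this sketch.  The function names are hypothetical; the code assumes the
day is less than 32 and the month is less than 16, so each fits in its
bit field.

```c
/* Pack a date into one unsigned int: day in bits 0-4,
   month in bits 5-8, year in the bits above those.  */
unsigned int
pack_date (unsigned int y, unsigned int m, unsigned int d)
{
  return (((y << 4) + m) << 5) + d;
}

/* Extract the day: the lowest 5 bits.  */
unsigned int
date_day (unsigned int date)
{
  return date % 32;
}

/* Extract the month: the next 4 bits.  */
unsigned int
date_month (unsigned int date)
{
  return (date >> 5) % 16;
}

/* Extract the year: everything above the lowest 9 bits.  */
unsigned int
date_year (unsigned int date)
{
  return date >> 9;
}
```

For the date in the example, @code{pack_date (1983, 6, 12)} gives
1015500, and the three extraction functions recover 12, 6 and 1983.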
+
|
|
|
+@code{-1 << LOWBITS} is a clever way to make an integer whose
|
|
|
+@code{LOWBITS} lowest bits are all 0 and the rest are all 1.
|
|
|
+@code{-(1 << LOWBITS)} is equivalent to that, due to associativity of
|
|
|
+multiplication, since negating a value is equivalent to multiplying it
|
|
|
+by @minus{}1.
|
|
|
+
|
|
|
+@node Bitwise Operations
|
|
|
+@section Bitwise Operations
|
|
|
+@cindex bitwise operators
|
|
|
+@cindex operators, bitwise
|
|
|
+@cindex negation, bitwise
|
|
|
+@cindex conjunction, bitwise
|
|
|
+@cindex disjunction, bitwise
|
|
|
+
|
|
|
+Bitwise operators operate on integers, treating each bit independently.
|
|
|
+They are not allowed for floating-point types.
|
|
|
+
|
|
|
+The examples in this section use binary constants, starting with
|
|
|
+@samp{0b} (@pxref{Integer Constants}). They stand for 32-bit integers
|
|
|
+of type @code{int}.
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item ~@code{a}
|
|
|
+Unary operator for bitwise negation; this changes each bit of
|
|
|
+@code{a} from 1 to 0 or from 0 to 1.
|
|
|
+
|
|
|
+@example
|
|
|
+~0b10101000 @result{} 0b11111111111111111111111101010111
|
|
|
+~0 @result{} 0b11111111111111111111111111111111
|
|
|
+~0b11111111111111111111111111111111 @result{} 0
|
|
|
+~ (-1) @result{} 0
|
|
|
+@end example
|
|
|
+
|
|
|
+It is useful to remember that @code{~@var{x} + 1} equals
|
|
|
+@code{-@var{x}}, for integers, and @code{~@var{x}} equals
|
|
|
+@code{-@var{x} - 1}. The last example above shows this with @minus{}1
|
|
|
+as @var{x}.
|
|
|
+
|
|
|
+@item @code{a} & @code{b}
|
|
|
+Binary operator for bitwise ``and'' or ``conjunction.'' Each bit in
|
|
|
+the result is 1 if that bit is 1 in both @code{a} and @code{b}.
|
|
|
+
|
|
|
+@example
|
|
|
+0b10101010 & 0b11001100 @result{} 0b10001000
|
|
|
+@end example
|
|
|
+
|
|
|
+@item @code{a} | @code{b}
|
|
|
+Binary operator for bitwise ``or'' (``inclusive or'' or
|
|
|
+``disjunction''). Each bit in the result is 1 if that bit is 1 in
|
|
|
+either @code{a} or @code{b}.
|
|
|
+
|
|
|
+@example
|
|
|
+0b10101010 | 0b11001100 @result{} 0b11101110
|
|
|
+@end example
|
|
|
+
|
|
|
+@item @code{a} ^ @code{b}
|
|
|
+Binary operator for bitwise ``xor'' (``exclusive or''). Each bit in
|
|
|
+the result is 1 if that bit is 1 in exactly one of @code{a} and @code{b}.
|
|
|
+
|
|
|
+@example
|
|
|
+0b10101010 ^ 0b11001100 @result{} 0b01100110
|
|
|
+@end example
|
|
|
+@end table
|
|
|
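To check results like these on a real machine, it helps to show values
in binary.  Here is a minimal sketch of a hypothetical helper,
@code{low_byte_to_binary}, that renders the lowest 8 bits of a value as
text.  It is an illustration, not a library function.

```c
/* Store the low 8 bits of value in buf as '0' and '1' characters,
   most significant bit first, followed by a null terminator.
   buf must have room for 9 characters.  */
void
low_byte_to_binary (unsigned int value, char buf[9])
{
  int i;

  for (i = 0; i < 8; i++)
    buf[i] = ((value >> (7 - i)) & 1u) ? '1' : '0';
  buf[8] = '\0';
}
```

Using the hexadecimal constants @code{0xAA} and @code{0xCC} for the two
operands above, @code{0xAA & 0xCC} renders as @samp{10001000},
@code{0xAA | 0xCC} as @samp{11101110}, and @code{0xAA ^ 0xCC} as
@samp{01100110}.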
+
|
|
|
+To understand the effect of these operators on signed integers, keep
|
|
|
+in mind that all modern computers use two's-complement representation
|
|
|
+(@pxref{Integer Representations}) for negative integers. This means
|
|
|
+that the highest bit of the number indicates the sign; it is 1 for a
|
|
|
+negative number and 0 for a positive number. In a negative number,
|
|
|
+the value in the other bits @emph{increases} as the number gets closer
|
|
|
+to zero, so that @code{0b111@r{@dots{}}111} is @minus{}1 and
|
|
|
+@code{0b100@r{@dots{}}000} is the most negative possible integer.
|
|
|
+
|
|
|
+@strong{Warning:} C defines a precedence ordering for the bitwise
|
|
|
+binary operators, but you should never rely on it. Likewise,
|
|
|
+never rely on how bitwise binary operators relate in precedence to the
|
|
|
+arithmetic and shift binary operators. Other programmers don't
|
|
|
+remember this precedence ordering, so always use parentheses to
|
|
|
+explicitly specify the nesting.
|
|
|
+
|
|
|
+For example, suppose @code{offset} is an integer that specifies
|
|
|
+the offset within shared memory of a table, except that its bottom few
|
|
|
+bits (@code{LOWBITS} says how many) are special flags. Here's
|
|
|
+how to get just that offset and add it to the base address.
|
|
|
+
|
|
|
+@example
|
|
|
+shared_mem_base + (offset & (-1 << LOWBITS))
|
|
|
+@end example
|
|
|
+
|
|
|
+Thanks to the outer set of parentheses, we don't need to know whether
|
|
|
+@samp{&} has higher precedence than @samp{+}. Thanks to the inner
|
|
|
+set, we don't need to know whether @samp{&} has higher precedence than
|
|
|
+@samp{<<}. But we can rely on all unary operators to have higher
|
|
|
+precedence than any binary operator, so we don't need parentheses
|
|
|
+around the left operand of @samp{<<}.
|
|
|
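The same masking idea can be sketched with unsigned arithmetic, fully
parenthesized as the warning recommends.  The function names are
hypothetical, and building the mask from @code{1u} rather than
@minus{}1 is a choice made for this sketch, not the form used above.
The code assumes @code{lowbits} is less than the width of
@code{unsigned int}.

```c
/* Clear the lowbits lowest bits of offset, leaving the rest.  */
unsigned int
clear_low_bits (unsigned int offset, unsigned int lowbits)
{
  return offset & ~((1u << lowbits) - 1u);
}

/* Keep only the lowbits lowest bits of offset (the flags).  */
unsigned int
low_bits_only (unsigned int offset, unsigned int lowbits)
{
  return offset & ((1u << lowbits) - 1u);
}
```

With @code{lowbits} equal to 3, the value @code{0x1237} splits into the
offset @code{0x1230} and the flags @code{0x7}.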
+
|
|
|
+@node Assignment Expressions
|
|
|
+@chapter Assignment Expressions
|
|
|
+@cindex assignment expressions
|
|
|
+@cindex operators, assignment
|
|
|
+
|
|
|
+As a general concept in programming, an @dfn{assignment} is a
|
|
|
+construct that stores a new value into a place where values can be
|
|
|
+stored---for instance, in a variable. Such places are called
|
|
|
+@dfn{lvalues} (@pxref{Lvalues}) because they are locations that hold a value.
|
|
|
+
|
|
|
+An assignment in C is an expression because it has a value; we call
|
|
|
+it an @dfn{assignment expression}. A simple assignment looks like
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} = @var{value-to-store}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+We say it assigns the value of the expression @var{value-to-store} to
|
|
|
+the location @var{lvalue}, or that it stores @var{value-to-store}
|
|
|
+there. You can think of the ``l'' in ``lvalue'' as standing for
|
|
|
+``left,'' since that's what you put on the left side of the assignment
|
|
|
+operator.
|
|
|
+
|
|
|
+However, that's not the only way to use an lvalue, and not all lvalues
|
|
|
+can be assigned to. To use the lvalue in the left side of an
|
|
|
+assignment, it has to be @dfn{modifiable}. In C, that means it was
|
|
|
+not declared with the type qualifier @code{const} (@pxref{const}).
|
|
|
+
|
|
|
+The value of the assignment expression is that of @var{lvalue} after
|
|
|
+the new value is stored in it. This means you can use an assignment
|
|
|
+inside other expressions. Assignment operators are right-associative
|
|
|
+so that
|
|
|
+
|
|
|
+@example
|
|
|
+x = y = z = 0;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is equivalent to
|
|
|
+
|
|
|
+@example
|
|
|
+x = (y = (z = 0));
|
|
|
+@end example
|
|
|
+
|
|
|
+This is the only useful way for them to associate;
|
|
|
+the other way,
|
|
|
+
|
|
|
+@example
|
|
|
+((x = y) = z) = 0;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+would be invalid since an assignment expression such as @code{x = y}
|
|
|
+is not valid as an lvalue.
|
|
|
+
|
|
|
+@strong{Warning:} Write parentheses around an assignment if you nest
|
|
|
+it inside another expression, unless that is a conditional expression,
|
|
|
+or comma-separated series, or another assignment.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Simple Assignment:: The basics of storing a value.
|
|
|
+* Lvalues:: Expressions into which a value can be stored.
|
|
|
+* Modifying Assignment:: Shorthand for changing an lvalue's contents.
|
|
|
+* Increment/Decrement:: Shorthand for incrementing and decrementing
|
|
|
+ an lvalue's contents.
|
|
|
+* Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
|
|
|
+* Assignment in Subexpressions:: How to avoid ambiguity.
|
|
|
+* Write Assignments Separately:: Write assignments as separate statements.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Simple Assignment
|
|
|
+@section Simple Assignment
|
|
|
+@cindex simple assignment
|
|
|
+@cindex assignment, simple
|
|
|
+
|
|
|
+A @dfn{simple assignment expression} computes the value of the right
|
|
|
+operand and stores it into the lvalue on the left. Here is a simple
|
|
|
+assignment expression that stores 5 in @code{i}:
|
|
|
+
|
|
|
+@example
|
|
|
+i = 5
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+We say that this is an @dfn{assignment to} the variable @code{i} and
|
|
|
+that it @dfn{assigns} @code{i} the value 5. It has no semicolon
|
|
|
+because it is an expression (so it has a value). Adding a semicolon
|
|
|
+at the end would make it a statement (@pxref{Expression Statement}).
|
|
|
+
|
|
|
+Here is another example of a simple assignment expression. Its
|
|
|
+operands are not simple, but the kind of assignment done here is
|
|
|
+simple assignment.
|
|
|
+
|
|
|
+@example
|
|
|
+x[foo ()] = y + 6
|
|
|
+@end example
|
|
|
+
|
|
|
+A simple assignment with two different numeric data types converts the
|
|
|
+right operand value to the lvalue's type, if possible. It can convert
|
|
|
+any numeric type to any other numeric type.
|
|
|
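Here is a small sketch of that conversion, combined with the rule that
the value of an assignment is the value stored in its lvalue.  The
function name is hypothetical, and the code assumes the argument is
within the range of @code{int}.

```c
/* Store v into an int, then store that assignment's value into a
   double.  Converting to int discards the fraction, so the double
   receives the truncated value, not v itself.  */
double
store_through_int (double v)
{
  int i;
  double d;

  d = i = v;          /* Same as d = (i = v).  */
  return d;
}
```

Calling @code{store_through_int (3.9)} returns 3.0, because @code{i}
holds 3 after the inner assignment.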
+
|
|
|
+Simple assignment is also allowed on some non-numeric types: pointers
|
|
|
+(@pxref{Pointers}), structures (@pxref{Structure Assignment}), and
|
|
|
+unions (@pxref{Unions}).
|
|
|
+
|
|
|
+@strong{Warning:} Assignment is not allowed on arrays because
|
|
|
+there are no array values in C; C variables can be arrays, but these
|
|
|
+arrays cannot be manipulated as wholes. @xref{Limitations of C
|
|
|
+Arrays}.
|
|
|
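Two common ways to get the effect of array assignment are sketched
below: copying the elements with @code{memcpy}, or wrapping the array in
a structure, since structures can be assigned.  The names
@code{copy_ints}, @code{copy_triple} and @code{struct triple} are
hypothetical.

```c
#include <string.h>   /* Declares memcpy and size_t. */

struct triple
{
  int item[3];
};

/* Copy n elements from src to dst.  The two arrays must not overlap.  */
void
copy_ints (int *dst, const int *src, size_t n)
{
  memcpy (dst, src, n * sizeof *src);
}

/* A structure that contains an array can be assigned as a whole.  */
void
copy_triple (struct triple *dst, const struct triple *src)
{
  *dst = *src;
}
```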
+
|
|
|
+@xref{Assignment Type Conversions}, for the complete rules about data
|
|
|
+types used in assignments.
|
|
|
+
|
|
|
+@node Lvalues
|
|
|
+@section Lvalues
|
|
|
+@cindex lvalues
|
|
|
+
|
|
|
+An expression that identifies a memory space that holds a value is
|
|
|
+called an @dfn{lvalue}, because it is a location that can hold a value.
|
|
|
+
|
|
|
+The standard kinds of lvalues are:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+A variable.
|
|
|
+
|
|
|
+@item
|
|
|
+A pointer-dereference expression (@pxref{Pointer Dereference}) using
|
|
|
+unary @samp{*}.
|
|
|
+
|
|
|
+@item
|
|
|
+A structure field reference (@pxref{Structures}) using @samp{.}, if
|
|
|
+the structure value is an lvalue.
|
|
|
+
|
|
|
+@item
|
|
|
+A structure field reference using @samp{->}. This is always an lvalue
|
|
|
+since @samp{->} implies pointer dereference.
|
|
|
+
|
|
|
+@item
|
|
|
+A union alternative reference (@pxref{Unions}), on the same conditions
|
|
|
+as for structure fields.
|
|
|
+
|
|
|
+@item
|
|
|
+An array-element reference using @samp{[@r{@dots{}}]}, if the array
|
|
|
+is an lvalue.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+If an expression's outermost operation is any other operator, that
|
|
|
+expression is not an lvalue. Thus, the variable @code{x} is an
|
|
|
+lvalue, but @code{x + 0} is not, even though these two expressions
|
|
|
+compute the same value (assuming @code{x} is a number).
|
|
|
+
|
|
|
+An array can be an lvalue (the rules above determine whether it is
|
|
|
+one), but using the array in an expression converts it automatically
|
|
|
+to a pointer to the first element. The result of this conversion is
|
|
|
+not an lvalue. Thus, if the variable @code{a} is an array, you can't
|
|
|
+use @code{a} by itself as the left operand of an assignment. But you
|
|
|
+can assign to an element of @code{a}, such as @code{a[0]}. That is an
|
|
|
+lvalue since @code{a} is an lvalue.
|
|
|
+
|
|
|
+@node Modifying Assignment
|
|
|
+@section Modifying Assignment
|
|
|
+@cindex modifying assignment
|
|
|
+@cindex assignment, modifying
|
|
|
+
|
|
|
+You can abbreviate the common construct
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} = @var{lvalue} + @var{expression}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+as
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} += @var{expression}
|
|
|
+@end example
|
|
|
+
|
|
|
+This is known as a @dfn{modifying assignment}. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+i = i + 5;
|
|
|
+i += 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+shows two statements that are equivalent. The first uses
|
|
|
+simple assignment; the second uses modifying assignment.
|
|
|
+
|
|
|
+Modifying assignment works with any binary arithmetic operator. For
|
|
|
+instance, you can subtract something from an lvalue like this,
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} -= @var{expression}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or multiply it by a certain amount like this,
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} *= @var{expression}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or shift it by a certain amount like this.
|
|
|
+
|
|
|
+@example
|
|
|
+@var{lvalue} <<= @var{expression}
|
|
|
+@var{lvalue} >>= @var{expression}
|
|
|
+@end example
|
|
|
+
|
|
|
+In most cases, this feature adds no power to the language, but it
|
|
|
+provides substantial convenience. Also, when @var{lvalue} contains
|
|
|
+code that has side effects, the simple assignment performs those side
|
|
|
+effects twice, while the modifying assignment performs them once. For
|
|
|
+instance,
|
|
|
+
|
|
|
+@example
|
|
|
+x[foo ()] = x[foo ()] + 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+calls @code{foo} twice, and it could return different values each
|
|
|
+time. If @code{foo ()} returns 1 the first time and 3 the second
|
|
|
+time, then the effect could be to add @code{x[3]} and 5 and store the
|
|
|
+result in @code{x[1]}, or to add @code{x[1]} and 5 and store the
|
|
|
+result in @code{x[3]}. We don't know which of the two it will do,
|
|
|
+because C does not specify which call to @code{foo} is computed first.
|
|
|
+
|
|
|
+Such a statement is not well defined, and shouldn't be used.
|
|
|
+
|
|
|
+By contrast,
|
|
|
+
|
|
|
+@example
|
|
|
+x[foo ()] += 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is well defined: it calls @code{foo} only once to determine which
|
|
|
+element of @code{x} to adjust, and it adjusts that element by adding 5
|
|
|
+to it.
|
|
|
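To see the single call in action, here is a sketch with a stand-in
@code{foo} that counts its calls and returns 1, then 3, alternately.
The counter @code{foo_calls} and the function @code{add_five_once} are
hypothetical names used only for this illustration.

```c
int x[4];              /* Starts out as all zeros.  */
int foo_calls;         /* How many times foo has been called.  */

/* Stand-in for foo in the text: returns 1, then 3, then 1, ...  */
int
foo (void)
{
  foo_calls++;
  return (foo_calls % 2 != 0) ? 1 : 3;
}

/* Adjust one element of x, calling foo exactly once.  */
void
add_five_once (void)
{
  x[foo ()] += 5;
}
```

After one call to @code{add_five_once}, @code{foo_calls} is 1 and only
@code{x[1]} has changed.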
+
|
|
|
+@node Increment/Decrement
|
|
|
+@section Increment and Decrement Operators
|
|
|
+@cindex increment operator
|
|
|
+@cindex decrement operator
|
|
|
+@cindex operator, increment
|
|
|
+@cindex operator, decrement
|
|
|
+@cindex preincrement expression
|
|
|
+@cindex predecrement expression
|
|
|
+
|
|
|
+The operators @samp{++} and @samp{--} are the @dfn{increment} and
|
|
|
+@dfn{decrement} operators. When used on a numeric value, they add or
|
|
|
+subtract 1. We don't consider them assignments, but they are
|
|
|
+equivalent to assignments.
|
|
|
+
|
|
|
+Using @samp{++} or @samp{--} as a prefix, before an lvalue, is called
|
|
|
+@dfn{preincrement} or @dfn{predecrement}. This adds or subtracts 1
|
|
|
+and the result becomes the expression's value. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ int i = 5;
|
|
|
+ printf ("%d\n", i);
|
|
|
+ printf ("%d\n", ++i);
|
|
|
+ printf ("%d\n", i);
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+prints lines containing 5, 6, and 6 again. The expression @code{++i}
|
|
|
+increments @code{i} from 5 to 6, and has the value 6, so the output
|
|
|
+from @code{printf} on that line says @samp{6}.
|
|
|
+
|
|
|
+Using @samp{--} instead, for predecrement,
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ int i = 5;
|
|
|
+ printf ("%d\n", i);
|
|
|
+ printf ("%d\n", --i);
|
|
|
+ printf ("%d\n", i);
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+prints three lines that contain (respectively) @samp{5}, @samp{4}, and
|
|
|
+again @samp{4}.
|
|
|
+
|
|
|
+@node Postincrement/Postdecrement
|
|
|
+@section Postincrement and Postdecrement
|
|
|
+@cindex postincrement expression
|
|
|
+@cindex postdecrement expression
|
|
|
+@cindex operator, postincrement
|
|
|
+@cindex operator, postdecrement
|
|
|
+
|
|
|
+Using @samp{++} or @samp{--} @emph{after} an lvalue does something
|
|
|
+peculiar: it gets the value directly out of the lvalue and @emph{then}
|
|
|
+increments or decrements it. Thus, the value of @code{i++} is the same
|
|
|
+as the value of @code{i}, but @code{i++} also increments @code{i} ``a
|
|
|
+little later.'' This is called @dfn{postincrement} or
|
|
|
+@dfn{postdecrement}.
|
|
|
+
|
|
|
+For example,
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ int i = 5;
|
|
|
+ printf ("%d\n", i);
|
|
|
+ printf ("%d\n", i++);
|
|
|
+ printf ("%d\n", i);
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+prints lines containing 5, again 5, and 6. The expression @code{i++}
|
|
|
+has the value 5, which is the value of @code{i} at the time,
|
|
|
+but it increments @code{i} from 5 to 6 just a little later.
|
|
|
+
|
|
|
+How much later is ``just a little later''? That is flexible. The
|
|
|
+increment has to happen by the next @dfn{sequence point}. In simple cases,
|
|
|
+that means by the end of the statement. @xref{Sequence Points}.
|
|
|
+
|
|
|
+If a unary operator precedes a postincrement or postdecrement expression,
|
|
|
+the increment nests inside:
|
|
|
+
|
|
|
+@example
|
|
|
+-a++ @r{is equivalent to} -(a++)
|
|
|
+@end example
|
|
|
+
|
|
|
+That's the only order that makes sense; @code{-a} is not an lvalue, so
|
|
|
+it can't be incremented.
|
|
|
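A short sketch makes the nesting visible.  The function name and its
parameters are hypothetical; it applies @code{-a++} to a local copy and
reports both the value of the expression and the value of @code{a}
afterward.

```c
/* Apply -a++ to a copy of start.  Return the expression's value
   and store the incremented copy through after.  */
int
negate_postincrement (int start, int *after)
{
  int a = start;
  int value = -a++;    /* Parsed as -(a++).  */

  *after = a;
  return value;
}
```

Starting from 5, the expression's value is @minus{}5 and @code{a}
becomes 6: the negation applies to the old value, and the increment
applies to @code{a} itself.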
+
|
|
|
+@node Assignment in Subexpressions
|
|
|
+@section Pitfall: Assignment in Subexpressions
|
|
|
+@cindex assignment in subexpressions
|
|
|
+@cindex subexpressions, assignment in
|
|
|
+
|
|
|
+In C, the order of computing parts of an expression is not fixed.
|
|
|
+Aside from a few special cases, the operations can be computed in any
|
|
|
+order. If one part of the expression has an assignment to @code{x}
|
|
|
+and another part of the expression uses @code{x}, the result is
|
|
|
+unpredictable because that use might be computed before or after the
|
|
|
+assignment.
|
|
|
+
|
|
|
+Here's an example of ambiguous code:
|
|
|
+
|
|
|
+@example
|
|
|
+x = 20;
|
|
|
+printf ("%d %d\n", x, x = 4);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+If the second argument, @code{x}, is computed before the third argument,
|
|
|
+@code{x = 4}, the second argument's value will be 20. If they are
|
|
|
+computed in the other order, the second argument's value will be 4.
|
|
|
+
|
|
|
+Here's one way to make that code unambiguous:
|
|
|
+
|
|
|
+@example
|
|
|
+y = 20;
|
|
|
+printf ("%d %d\n", y, x = 4);
|
|
|
+@end example
|
|
|
+
|
|
|
+Here's another way, with the other meaning:
|
|
|
+
|
|
|
+@example
|
|
|
+x = 4;
|
|
|
+printf ("%d %d\n", x, x);
|
|
|
+@end example
|
|
|
+
|
|
|
+This issue applies to all kinds of assignments, and to the increment
|
|
|
+and decrement operators, which are equivalent to assignments.
|
|
|
+@xref{Order of Execution}, for more information about this.
|
|
|
+
|
|
|
+However, it can be useful to write assignments inside an
|
|
|
+@code{if}-condition or @code{while}-test along with logical operators.
|
|
|
+@xref{Logicals and Assignments}.
|
|
|
+
|
|
|
+@node Write Assignments Separately
|
|
|
+@section Write Assignments in Separate Statements
|
|
|
+
|
|
|
+It is often convenient to write an assignment inside an
|
|
|
+@code{if}-condition, but that can reduce the readability of the
|
|
|
+program. Here's an example of what to avoid:
|
|
|
+
|
|
|
+@example
|
|
|
+if (x = advance (x))
|
|
|
+ @r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+The idea here is to advance @code{x} and test if the value is nonzero.
|
|
|
+However, readers might miss the fact that it uses @samp{=} and not
|
|
|
+@samp{==}. In fact, writing @samp{=} where @samp{==} was intended
|
|
|
+inside a condition is a common error, so GNU C can give warnings when
|
|
|
+@samp{=} appears in a way that suggests it's an error.
|
|
|
+
|
|
|
+It is much clearer to write the assignment as a separate statement, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+x = advance (x);
|
|
|
+if (x != 0)
|
|
|
+ @r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This makes it unmistakably clear that @code{x} is assigned a new value.
|
|
|
+
|
|
|
+Another method is to use the comma operator (@pxref{Comma Operator}),
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (x = advance (x), x != 0)
|
|
|
+ @r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+However, putting the assignment in a separate statement is usually clearer
|
|
|
+unless the assignment is very short, because it reduces nesting.
|
|
|
+
|
|
|
+@node Execution Control Expressions
|
|
|
+@chapter Execution Control Expressions
|
|
|
+@cindex execution control expressions
|
|
|
+@cindex expressions, execution control
|
|
|
+
|
|
|
+This chapter describes the C operators that combine expressions to
|
|
|
+control which of those expressions execute, or in which order.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Logical Operators:: Logical conjunction, disjunction, negation.
|
|
|
+* Logicals and Comparison:: Logical operators with comparison operators.
|
|
|
+* Logicals and Assignments:: Assignments with logical operators.
|
|
|
+* Conditional Expression:: An if/else construct inside expressions.
|
|
|
+* Comma Operator:: Build a sequence of subexpressions.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Logical Operators
|
|
|
+@section Logical Operators
|
|
|
+@cindex logical operators
|
|
|
+@cindex operators, logical
|
|
|
+@cindex conjunction operator
|
|
|
+@cindex disjunction operator
|
|
|
+@cindex negation operator, logical
|
|
|
+
|
|
|
+The @dfn{logical operators} combine truth values, which are normally
|
|
|
+represented in C as numbers. Any expression with a numeric value is a
|
|
|
+valid truth value: zero means false, and any other value means true.
|
|
|
+A pointer type is also meaningful as a truth value; a null pointer
|
|
|
+(which is zero) means false, and a non-null pointer means true
|
|
|
+(@pxref{Pointer Types}). The value of a logical operator is always 1
|
|
|
+or 0 and has type @code{int} (@pxref{Integer Types}).
|
|
|
+
|
|
|
+The logical operators are used mainly in the condition of an @code{if}
|
|
|
+statement, or in the end test in a @code{for} statement or
|
|
|
+@code{while} statement (@pxref{Statements}). However, they are valid
|
|
|
+in any context where an integer-valued expression is allowed.
|
|
|
+
|
|
|
+@table @samp
|
|
|
+@item ! @var{exp}
|
|
|
+Unary operator for logical ``not.'' The value is 1 (true) if
|
|
|
+@var{exp} is 0 (false), and 0 (false) if @var{exp} is nonzero (true).
|
|
|
+
|
|
|
+@strong{Warning:} if @var{exp} is anything but an lvalue or a
|
|
|
+function call, you should write parentheses around it.
|
|
|
+
|
|
|
+@item @var{left} && @var{right}
|
|
|
+The logical ``and'' binary operator computes @var{left} and, if necessary,
|
|
|
+@var{right}. If both of the operands are true, the @samp{&&} expression
|
|
|
+gives the value 1 (which is true). Otherwise, the @samp{&&} expression
|
|
|
+gives the value 0 (false). If @var{left} yields a false value,
|
|
|
+that determines the overall result, so @var{right} is not computed.
|
|
|
+
|
|
|
+@item @var{left} || @var{right}
|
|
|
+The logical ``or'' binary operator computes @var{left} and, if necessary,
|
|
|
+@var{right}. If at least one of the operands is true, the @samp{||} expression
|
|
|
+gives the value 1 (which is true). Otherwise, the @samp{||} expression
|
|
|
+gives the value 0 (false). If @var{left} yields a true value,
|
|
|
+that determines the overall result, so @var{right} is not computed.
|
|
|
+@end table
|
|
|
+
|
|
|
+@strong{Warning:} never rely on the relative precedence of @samp{&&}
|
|
|
+and @samp{||}. When you use them together, always use parentheses to
|
|
|
+specify explicitly how they nest, as shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+if ((r != 0 && x % r == 0)
|
|
|
+ ||
|
|
|
+ (s != 0 && x % s == 0))
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Logicals and Comparison
|
|
|
+@section Logical Operators and Comparisons
|
|
|
+
|
|
|
+The most common thing to use inside the logical operators is a
|
|
|
+comparison. Conveniently, @samp{&&} and @samp{||} have lower
|
|
|
+precedence than comparison operators and arithmetic operators, so we
|
|
|
+can write expressions like this without parentheses and get the
|
|
|
+nesting that is natural: two comparison operations that must both be
|
|
|
+true.
|
|
|
+
|
|
|
+@example
|
|
|
+if (r != 0 && x % r == 0)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This example also shows how it is useful that @samp{&&} guarantees to
|
|
|
+skip the right operand if the left one turns out false. Because of
|
|
|
+that, this code never tries to divide by zero.
|
|
|
+
|
|
|
+This is equivalent:
|
|
|
+
|
|
|
+@example
|
|
|
+if (r && x % r == 0)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+A truth value is simply a number, so @code{r}
|
|
|
+as a truth value tests whether it is nonzero.
|
|
|
+But @code{r}'s meaning is not a truth value---it is a number to divide by.
|
|
|
+So it is better style to write the explicit @code{!= 0}.
|
|
|
+
|
|
|
+Here's another equivalent way to write it:
|
|
|
+
|
|
|
+@example
|
|
|
+if (!(r == 0) && x % r == 0)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This illustrates the unary @samp{!} operator, and the need to
|
|
|
+write parentheses around its operand.
|
|
|
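The guarantee that @samp{&&} skips its right operand can be observed
with a counter, as in this sketch.  The names @code{remainder_tests},
@code{remainder_is_zero} and @code{divides} are hypothetical.

```c
int remainder_tests;   /* Counts evaluations of the right operand.  */

/* Right operand of && below; records that it ran.  */
static int
remainder_is_zero (int x, int r)
{
  remainder_tests++;
  return x % r == 0;
}

/* True if r is nonzero and divides x.  The call on the right is
   skipped when r is 0, so no division by zero occurs.  */
int
divides (int r, int x)
{
  return r != 0 && remainder_is_zero (x, r);
}
```

Calling @code{divides (0, 10)} returns 0 and leaves
@code{remainder_tests} unchanged, which shows that the remainder was
never computed.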
+
|
|
|
+@node Logicals and Assignments
|
|
|
+@section Logical Operators and Assignments
|
|
|
+
|
|
|
+There are cases where assignments nested inside the condition can
|
|
|
+actually make a program @emph{easier} to read. Here is an example
|
|
|
+using a hypothetical type @code{list} which represents a list; it
|
|
|
+tests whether the list has at least two links, using hypothetical
|
|
|
+functions, @code{nonempty} which is true if the argument is a nonempty
|
|
|
+list, and @code{list_next} which advances from one list link to the
|
|
|
+next. We assume that a list is never a null pointer, so that the
|
|
|
+assignment expressions are always ``true.''
|
|
|
+
|
|
|
+@example
|
|
|
+if (nonempty (list)
|
|
|
+ && (temp1 = list_next (list))
|
|
|
+ && nonempty (temp1)
|
|
|
+ && (temp2 = list_next (temp1)))
|
|
|
+ @r{@dots{}} /* @r{use @code{temp1} and @code{temp2}} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Here we get the benefit of the @samp{&&} operator, to avoid executing
|
|
|
+the rest of the code if a call to @code{nonempty} says ``false.'' The
|
|
|
+only natural place to put the assignments is among those calls.
|
|
|
+
|
|
|
+It would be possible to rewrite this as several statements, but that
|
|
|
+could make it much more cumbersome. On the other hand, when the test
|
|
|
+is even more complex than this one, splitting it into multiple
|
|
|
+statements might be necessary for clarity.
|
|
|
+
|
|
|
+If an empty list is a null pointer, we can dispense with calling
|
|
|
+@code{nonempty}:
|
|
|
+
|
|
|
+@example
|
|
|
+if ((temp1 = list_next (list))
|
|
|
+ && (temp2 = list_next (temp1)))
|
|
|
+ @r{@dots{}}
|
|
|
+@end example
|
|
|
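Here is a sketch of that second form with concrete definitions filled
in.  The type @code{struct link}, this version of @code{list_next}, and
the function @code{sum_second_and_third} are assumptions made for
illustration.  An empty list is represented as a null pointer.

```c
#include <stddef.h>   /* Declares NULL. */

struct link
{
  int value;
  struct link *next;
};

/* Hypothetical list_next: the link after list, or NULL if none.  */
struct link *
list_next (struct link *list)
{
  return list != NULL ? list->next : NULL;
}

/* Add the values of the second and third links.
   Return 0 if either link is missing.  */
int
sum_second_and_third (struct link *list)
{
  struct link *temp1, *temp2;

  if ((temp1 = list_next (list))
      && (temp2 = list_next (temp1)))
    return temp1->value + temp2->value;
  return 0;
}
```

Because @samp{&&} stops at the first null result, @code{temp2} is
assigned only when @code{temp1} is non-null, and the body uses both only
when both are valid.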
+
|
|
|
+@node Conditional Expression
|
|
|
+@section Conditional Expression
|
|
|
+@cindex conditional expression
|
|
|
+@cindex expression, conditional
|
|
|
+
|
|
|
+C has a conditional expression that selects one of two expressions
|
|
|
+to compute and get the value from. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{condition} ? @var{iftrue} : @var{iffalse}
|
|
|
+@end example
|
|
|
+
|
|
|
+@menu
|
|
|
+* Conditional Rules:: Rules for the conditional operator.
|
|
|
+* Conditional Branches:: About the two branches in a conditional.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Conditional Rules
|
|
|
+@subsection Rules for Conditional Operator
|
|
|
+
|
|
|
+The first operand, @var{condition}, should be a value that can be
|
|
|
+compared with zero---a number or a pointer. If it is true (nonzero),
|
|
|
+then the conditional expression computes @var{iftrue} and its value
|
|
|
+becomes the value of the conditional expression. Otherwise the
|
|
|
+conditional expression computes @var{iffalse} and its value becomes
|
|
|
+the value of the conditional expression. The conditional expression
|
|
|
+always computes just one of @var{iftrue} and @var{iffalse}, never both
|
|
|
+of them.
|
|
|
+
|
|
|
+Here's an example: the absolute value of a number @code{x}
|
|
|
+can be written as @code{(x >= 0 ? x : -x)}.
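The following sketch illustrates both points: the absolute-value expression, and the rule that only the selected branch is computed. The helper functions and counters are hypothetical names introduced only to make the evaluation visible.

```c
/* Hypothetical helpers that count how often each branch runs.  */
static int true_calls = 0;
static int false_calls = 0;

static int
if_true (void)
{
  true_calls++;
  return 1;
}

static int
if_false (void)
{
  false_calls++;
  return 2;
}

/* Absolute value; note that -x overflows when x is INT_MIN.  */
int
abs_value (int x)
{
  return (x >= 0 ? x : -x);
}

/* Compute exactly one branch, selected by CONDITION.  */
int
choose (int condition)
{
  return (condition ? if_true () : if_false ());
}
```

After one call to @code{choose} with a nonzero argument, @code{true_calls} is 1 and @code{false_calls} is still 0.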
|
|
|
+
|
|
|
+@strong{Warning:} The conditional expression operators have rather low
|
|
|
+syntactic precedence. Except when the conditional expression is used
|
|
|
+as an argument in a function call, write parentheses around it. For
|
|
|
+clarity, always write parentheses around it if it extends across more
|
|
|
+than one line.
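To see why the parentheses matter, here is a small sketch with two hypothetical functions that differ only in parenthesization. A compiler may warn about the second one, which is part of the point.

```c
/* The conditional is an operand of the addition.  */
int
with_parens (int c)
{
  return 1 + (c ? 10 : 20);
}

/* Without parentheses, this parses as (1 + c) ? 10 : 20,
   because + has higher precedence than ?:.  */
int
without_parens (int c)
{
  return 1 + c ? 10 : 20;
}
```

With an argument of 0, the first function adds 1 to the selected value 20, whereas the second tests @code{1 + 0} and selects 10.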
|
|
|
+
|
|
|
+Assignment operators and the comma operator (@pxref{Comma Operator})
|
|
|
+have lower precedence than conditional expression operators, so write
|
|
|
+parentheses around those when they appear inside a conditional
|
|
|
+expression. @xref{Order of Execution}.
|
|
|
+
|
|
|
+@node Conditional Branches
|
|
|
+@subsection Conditional Operator Branches
|
|
|
+@cindex branches of conditional expression
|
|
|
+
|
|
|
+We call @var{iftrue} and @var{iffalse} the @dfn{branches} of the
|
|
|
+conditional.
|
|
|
+
|
|
|
+The two branches should normally have the same type, but a few
|
|
|
+exceptions are allowed. If they are both numeric types, the
|
|
|
+conditional converts both to their common type (@pxref{Common Type}).
|
|
|
+
|
|
|
+With pointers (@pxref{Pointers}), the two values can be pointers to
|
|
|
+nearly compatible types (@pxref{Compatible Types}). In this case, the
|
|
|
+result type is a similar pointer whose target type combines all the
|
|
|
+type qualifiers (@pxref{Type Qualifiers}) of both branches.
|
|
|
+
|
|
|
+If one branch has type @code{void *} and the other is a pointer to an
|
|
|
+object (not to a function), the conditional converts the @code{void *}
|
|
|
+branch to the type of the other.
|
|
|
+
|
|
|
+If one branch is an integer constant with value zero and the other is
|
|
|
+a pointer, the conditional converts zero to the pointer's type.
|
|
|
+
|
|
|
+In GNU C, you can omit @var{iftrue} in a conditional expression. In
|
|
|
+that case, if @var{condition} is nonzero, its value becomes the value of
|
|
|
+the conditional expression, after conversion to the common type.
|
|
|
+Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+x ? : y
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+has the value of @code{x} if that is nonzero; otherwise, the value of
|
|
|
+@code{y}.
|
|
|
+
|
|
|
+@cindex side effect in ?:
|
|
|
+@cindex ?: side effect
|
|
|
+Omitting @var{iftrue} is useful when @var{condition} has side effects.
|
|
|
+In that case, writing that expression twice would carry out the side
|
|
|
+effects twice, but writing it once does them just once. For example,
|
|
|
+if we suppose that the function @code{next_element} advances a pointer
|
|
|
+variable to point to the next element in a list and returns the new
|
|
|
+pointer,
|
|
|
+
|
|
|
+@example
|
|
|
+next_element () ? : default_pointer
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is a way to advance the pointer and use its new value if it isn't
|
|
|
+null, but use @code{default_pointer} if the new value is null. We must not do
|
|
|
+it this way,
|
|
|
+
|
|
|
+@example
|
|
|
+next_element () ? next_element () : default_pointer
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+because it would advance the pointer a second time.
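A minimal sketch of this idiom follows. The array, the position variable, the call counter, and this particular definition of @code{next_element} are all assumptions made for illustration. The omitted-operand form is a GNU C extension, so this needs GCC or a compatible compiler.

```c
#include <stddef.h>

static const int items[] = { 10, 20, 30 };
static size_t position = 0;     /* Index of the current element.  */
static int advance_count = 0;   /* How many times we tried to advance.  */
static const int fallback = -1;

/* Hypothetical stand-in: advance to the next element and return
   a pointer to it, or a null pointer at the end of the array.  */
static const int *
next_element (void)
{
  advance_count++;
  if (position + 1 >= sizeof items / sizeof items[0])
    return NULL;
  return &items[++position];
}

/* Advance once; use the new pointer unless it is null.  */
const int *
next_or_default (void)
{
  return (next_element () ? : &fallback);
}
```

Each call to @code{next_or_default} increases @code{advance_count} by exactly one, which would not hold if @code{next_element ()} were written twice.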
|
|
|
+
|
|
|
+@node Comma Operator
|
|
|
+@section Comma Operator
|
|
|
+@cindex comma operator
|
|
|
+@cindex operator, comma
|
|
|
+
|
|
|
+The comma operator stands for sequential execution of expressions.
|
|
|
+The value of the comma expression comes from the last expression in
|
|
|
+the sequence; the previous expressions are computed only for their
|
|
|
+side effects. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{exp1}, @var{exp2} @r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+You can bundle any number of expressions together this way, by putting
|
|
|
+commas between them.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Uses of Comma:: When to use the comma operator.
|
|
|
+* Clean Comma:: Clean use of the comma operator.
|
|
|
+* Avoid Comma:: When not to use the comma operator.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Uses of Comma
|
|
|
+@subsection The Uses of the Comma Operator
|
|
|
+
|
|
|
+With commas, you can put several expressions into a place that
|
|
|
+requires just one expression---for example, in the header of a
|
|
|
+@code{for} statement. This statement
|
|
|
+
|
|
|
+@example
|
|
|
+for (i = 0, j = 10, k = 20; i < n; i++)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+contains three assignment expressions, to initialize @code{i}, @code{j}
|
|
|
+and @code{k}. The syntax of @code{for} requires just one expression
|
|
|
+for initialization; to include three assignments, we use commas to
|
|
|
+bundle them into a single larger expression, @code{i = 0, j = 10, k =
|
|
|
+20}. This technique is also useful in the loop-advance expression,
|
|
|
+the last of the three inside the @code{for} parentheses.
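As one possible illustration of commas in both the initialization and the loop-advance expression, here is a sketch that reverses an array in place with two indices. The function name is hypothetical.

```c
#include <stddef.h>

/* Reverse the N elements of A in place.  */
void
reverse_ints (int *a, size_t n)
{
  size_t i, j;

  if (n == 0)
    return;                     /* Avoid computing n - 1 for n == 0.  */

  /* Two assignments to initialize, two updates to advance.  */
  for (i = 0, j = n - 1; i < j; i++, j--)
    {
      int tmp = a[i];
      a[i] = a[j];
      a[j] = tmp;
    }
}
```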
|
|
|
+
|
|
|
+In the @code{for} statement and the @code{while} statement
|
|
|
+(@pxref{Loop Statements}), a comma provides a way to perform some side
|
|
|
+effect before the loop-exit test. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+while (printf ("At the test, x = %d\n", x), x != 0)
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Clean Comma
|
|
|
+@subsection Clean Use of the Comma Operator
|
|
|
+
|
|
|
+Always write parentheses around a series of comma operators, except
|
|
|
+when it is at top level in an expression statement, or within the
|
|
|
+parentheses of an @code{if}, @code{for}, @code{while}, or @code{switch}
|
|
|
+statement (@pxref{Statements}). For instance, in
|
|
|
+
|
|
|
+@example
|
|
|
+for (i = 0, j = 10, k = 20; i < n; i++)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+the commas between the assignments are clear because they are between
|
|
|
+a parenthesis and a semicolon.
|
|
|
+
|
|
|
+The arguments in a function call are also separated by commas, but that is
|
|
|
+not an instance of the comma operator. Note the difference between
|
|
|
+
|
|
|
+@example
|
|
|
+foo (4, 5, 6)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which passes three arguments to @code{foo} and
|
|
|
+
|
|
|
+@example
|
|
|
+foo ((4, 5, 6))
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which uses the comma operator and passes just one argument
|
|
|
+(with value 6).
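The difference can be checked with a sketch like this one. Here @code{foo} is assumed to take a single argument and return it, and @code{note} is a hypothetical helper with a visible side effect, used in place of the plain constants so that the discarded operands still do something.

```c
static int calls = 0;

/* Hypothetical helper: record that it ran, and return V.  */
static int
note (int v)
{
  calls++;
  return v;
}

/* Assumed one-argument function that returns its argument.  */
static int
foo (int v)
{
  return v;
}

/* The inner parentheses make the commas comma operators, so foo
   receives one argument: the value of the last expression.  */
int
one_argument (void)
{
  return foo ((note (4), note (5), 6));
}
```

Both calls to @code{note} are carried out for their side effects, and their values are discarded.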
|
|
|
+
|
|
|
+@strong{Warning:} Don't use the comma operator inside an argument
|
|
|
+of a function unless it helps understand the code. When you do so,
|
|
|
+don't put part of another argument on the same line. Instead, add a
|
|
|
+line break to make the parentheses around the comma operator easier to
|
|
|
+see, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+foo ((mumble (x, y), frob (z)),
|
|
|
+ *p)
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Avoid Comma
|
|
|
+@subsection When Not to Use the Comma Operator
|
|
|
+
|
|
|
+You can use a comma in any subexpression, but in most cases it only
|
|
|
+makes the code confusing, and it is clearer to raise all but the last
|
|
|
+of the comma-separated expressions to a higher level. Thus, instead
|
|
|
+of this:
|
|
|
+
|
|
|
+@example
|
|
|
+x = (y += 4, 8);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+it is much clearer to write this:
|
|
|
+
|
|
|
+@example
|
|
|
+y += 4, x = 8;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or this:
|
|
|
+
|
|
|
+@example
|
|
|
+y += 4;
|
|
|
+x = 8;
|
|
|
+@end example
|
|
|
+
|
|
|
+Use commas only in the cases where there is no clearer alternative
|
|
|
+involving multiple statements.
|
|
|
+
|
|
|
+By contrast, don't hesitate to use commas in the expansion in a macro
|
|
|
+definition. The trade-offs of code clarity are different in that
|
|
|
+case, because the @emph{use} of the macro may improve overall clarity
|
|
|
+so much that the ugliness of the macro's @emph{definition} is a small
|
|
|
+price to pay. @xref{Macros}.
|
|
|
+
|
|
|
+@node Binary Operator Grammar
|
|
|
+@chapter Binary Operator Grammar
|
|
|
+@cindex binary operator grammar
|
|
|
+@cindex grammar, binary operator
|
|
|
+@cindex operator precedence
|
|
|
+@cindex precedence, operator
|
|
|
+@cindex left-associative
|
|
|
+
|
|
|
+@dfn{Binary operators} are those that take two operands, one
|
|
|
+on the left and one on the right.
|
|
|
+
|
|
|
+All the binary operators in C, except the assignment operators, are syntactically left-associative.
|
|
|
+This means that @w{@code{a @var{op} b @var{op} c}} means @w{@code{(a
|
|
|
+@var{op} b) @var{op} c}}. However, you should only write repeated
|
|
|
+operators without parentheses using @samp{+}, @samp{-}, @samp{*} and
|
|
|
+@samp{/}, because those cases are clear from algebra. So it is OK to
|
|
|
+write @code{a + b + c} or @code{a - b - c}, but never @code{a == b ==
|
|
|
+c} or @code{a % b % c}.
|
|
|
+
|
|
|
+Each C operator has a @dfn{precedence}, which is its rank in the
|
|
|
+grammatical order of the various operators. The operators with the
|
|
|
+highest precedence grab adjoining operands first; these expressions
|
|
|
+then become operands for operators of lower precedence.
|
|
|
+
|
|
|
+The precedence order of operators in C is fully specified, so any
|
|
|
+combination of operations leads to a well-defined nesting. We state
|
|
|
+only part of the full precedence ordering here because it is bad
|
|
|
+practice for C code to depend on the other cases. For cases not
|
|
|
+specified in this chapter, always use parentheses to make the nesting
|
|
|
+explicit.@footnote{Personal note from Richard Stallman: I wrote GCC without
|
|
|
+remembering anything about the C precedence order beyond what's stated
|
|
|
+here. I studied the full precedence table to write the parser, and
|
|
|
+promptly forgot it again. If you need to look up the full precedence order
|
|
|
+to understand some C code, fix the code with parentheses so nobody else
|
|
|
+needs to do that.}
|
|
|
+
|
|
|
+You can depend on this subsequence of the precedence ordering
|
|
|
+(stated from highest precedence to lowest):
|
|
|
+
|
|
|
+@enumerate
|
|
|
+@item
|
|
|
+Component access (@samp{.} and @samp{->}).
|
|
|
+
|
|
|
+@item
|
|
|
+Unary postfix operators.
|
|
|
+
|
|
|
+@item
|
|
|
+Unary prefix operators.
|
|
|
+
|
|
|
+@item
|
|
|
+Multiplication, division, and remainder (they have the same precedence).
|
|
|
+
|
|
|
+@item
|
|
|
+Addition and subtraction (they have the same precedence).
|
|
|
+
|
|
|
+@item
|
|
|
+Comparisons---but watch out!
|
|
|
+
|
|
|
+@item
|
|
|
+Logical operators @samp{&&} and @samp{||}---but watch out!
|
|
|
+
|
|
|
+@item
|
|
|
+Conditional expression with @samp{?} and @samp{:}.
|
|
|
+
|
|
|
+@item
|
|
|
+Assignments.
|
|
|
+
|
|
|
+@item
|
|
|
+Sequential execution (the comma operator, @samp{,}).
|
|
|
+@end enumerate
|
|
|
+
|
|
|
+Two of the lines in the above list say ``but watch out!'' That means
|
|
|
+that each of those lines covers operators with subtly different precedence.
|
|
|
+Never depend on the grammar of C to decide how two comparisons nest;
|
|
|
+instead, always use parentheses to specify their nesting.
|
|
|
+
|
|
|
+You can let several @samp{&&} operators associate, or several
|
|
|
+@samp{||} operators, but always use parentheses to show how @samp{&&}
|
|
|
+and @samp{||} nest with each other. @xref{Logical Operators}.
|
|
|
+
|
|
|
+There is one other precedence ordering that code can depend on:
|
|
|
+
|
|
|
+@enumerate
|
|
|
+@item
|
|
|
+Unary postfix operators.
|
|
|
+
|
|
|
+@item
|
|
|
+Bitwise and shift operators---but watch out!
|
|
|
+
|
|
|
+@item
|
|
|
+Conditional expression with @samp{?} and @samp{:}.
|
|
|
+@end enumerate
|
|
|
+
|
|
|
+The caveat for bitwise and shift operators is like that for logical
|
|
|
+operators: you can let multiple uses of one bitwise operator
|
|
|
+associate, but always use parentheses to control nesting of dissimilar
|
|
|
+operators.
|
|
|
+
|
|
|
+These lists do not specify any precedence ordering between the bitwise
|
|
|
+and shift operators of the second list and the binary operators above
|
|
|
+conditional expressions in the first list. When they come together,
|
|
|
+parenthesize them. @xref{Bitwise Operations}.
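The following sketch shows two of the traps these rules are meant to avoid. All four function names are hypothetical, and a compiler may well warn about the unparenthesized versions.

```c
/* Parses as x & (1 == 0), which is x & 0: always 0.  */
int
is_even_wrong (unsigned x)
{
  return x & 1 == 0;
}

/* Parentheses state the intended nesting.  */
int
is_even (unsigned x)
{
  return (x & 1) == 0;
}

/* Parses as (a == b) == c: compares a 0-or-1 result with c.  */
int
all_equal_wrong (int a, int b, int c)
{
  return a == b == c;
}

/* Two separate comparisons joined with &&.  */
int
all_equal (int a, int b, int c)
{
  return (a == b) && (b == c);
}
```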
|
|
|
+
|
|
|
+@node Order of Execution
|
|
|
+@chapter Order of Execution
|
|
|
+@cindex order of execution
|
|
|
+
|
|
|
+The order of execution of a C program is not always obvious, and not
|
|
|
+necessarily predictable. This chapter describes what you can count on.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Reordering of Operands:: Operations in C are not necessarily computed
|
|
|
+ in the order they are written.
|
|
|
+* Associativity and Ordering:: Some associative operations are performed
|
|
|
+ in a particular order; others are not.
|
|
|
+* Sequence Points:: Some guarantees about the order of operations.
|
|
|
+* Postincrement and Ordering:: Ambiguous execution order with postincrement.
|
|
|
+* Ordering of Operands:: Evaluation order of operands
|
|
|
+ and function arguments.
|
|
|
+* Optimization and Ordering:: Compiler optimizations can reorder operations
|
|
|
+ only if it has no impact on program results.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Reordering of Operands
|
|
|
+@section Reordering of Operands
|
|
|
+@cindex ordering of operands
|
|
|
+@cindex reordering of operands
|
|
|
+@cindex operand execution ordering
|
|
|
+
|
|
|
+The C language does not necessarily carry out operations within an
|
|
|
+expression in the order they appear in the code. For instance, in
|
|
|
+this expression,
|
|
|
+
|
|
|
+@example
|
|
|
+foo () + bar ()
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+@code{foo} might be called first or @code{bar} might be called first.
|
|
|
+If @code{foo} updates a datum and @code{bar} uses that datum, the
|
|
|
+results can be unpredictable.
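Here is a sketch of that situation, with hypothetical definitions of @code{foo} and @code{bar} that share a counter. When the order matters, one way to pin it down is to call the functions in separate statements.

```c
static int counter = 0;

/* Updates the shared datum.  */
static int
foo (void)
{
  return ++counter;
}

/* Uses the shared datum.  */
static int
bar (void)
{
  return counter * 10;
}

/* Either call may come first: starting from counter == 0,
   the result is 1 + 10 or 1 + 0.  */
int
sum_unspecified (void)
{
  return foo () + bar ();
}

/* Separate statements fix the order: foo, then bar.  */
int
sum_ordered (void)
{
  int f = foo ();
  int b = bar ();
  return f + b;
}
```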
|
|
|
+
|
|
|
+The unpredictable order of computation of subexpressions also makes a
|
|
|
+difference when one of them contains an assignment. We already saw
|
|
|
+this example of bad code,
|
|
|
+
|
|
|
+@example
|
|
|
+x = 20;
|
|
|
+printf ("%d %d\n", x, x = 4);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+in which the second argument, @code{x}, has a different value
|
|
|
+depending on whether it is computed before or after the assignment in
|
|
|
+the third argument.
|
|
|
+
|
|
|
+@node Associativity and Ordering
|
|
|
+@section Associativity and Ordering
|
|
|
+@cindex associativity and ordering
|
|
|
+
|
|
|
+An associative binary operator, such as @code{+}, when used repeatedly
|
|
|
+can combine any number of operands. The operands' values may be
|
|
|
+computed in any order.
|
|
|
+
|
|
|
+If the values are integers and overflow can be ignored, they may be
|
|
|
+combined in any order. Thus, given four functions that return
|
|
|
+@code{unsigned int}, calling them and adding their results as here
|
|
|
+
|
|
|
+@example
|
|
|
+(foo () + bar ()) + (baz () + quux ())
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+may add up the results in any order.
|
|
|
+
|
|
|
+By contrast, arithmetic on signed integers, with overflow significant,
|
|
|
+is not really associative (@pxref{Integer Overflow}). Thus, the
|
|
|
+additions must be done in the order specified, obeying parentheses and
|
|
|
+left-association. That means computing @code{(foo () + bar ())} and
|
|
|
+@code{(baz () + quux ())} first (in either order), then adding the
|
|
|
+two.
|
|
|
+
|
|
|
+The same applies to arithmetic on floating-point values, since that
|
|
|
+too is not really associative. However, the GCC option
|
|
|
+@option{-funsafe-math-optimizations} allows the compiler to change the
|
|
|
+order of calculation when an associative operation (associative in
|
|
|
+exact mathematics) combines several operands. The option takes effect
|
|
|
+when compiling a module (@pxref{Compilation}). Changing the order
|
|
|
+of association can enable the program to pipeline the floating-point
|
|
|
+operations.
|
|
|
+
|
|
|
+In all these cases, the four function calls can be done in any order.
|
|
|
+There is no right or wrong about that.
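To make the floating-point case concrete, here is a sketch with two hypothetical functions that add the same three values with different grouping. It assumes IEEE 754 @code{double} arithmetic and compilation without @option{-funsafe-math-optimizations}.

```c
/* Add left to right: (a + b) + c.  */
double
sum_left (double a, double b, double c)
{
  return (a + b) + c;
}

/* Add right to left: a + (b + c).  */
double
sum_right (double a, double b, double c)
{
  return a + (b + c);
}
```

Called with the arguments 1e20, -1e20 and 1.0, the first function cancels the large values and then adds 1.0. In the second, 1.0 is too small to change -1e20, so the small term is lost before the cancellation happens.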
|
|
|
+
|
|
|
+@node Sequence Points
|
|
|
+@section Sequence Points
|
|
|
+@cindex sequence points
|
|
|
+@cindex full expression
|
|
|
+
|
|
|
+There are some points in the code where C makes limited guarantees
|
|
|
+about the order of operations. These are called @dfn{sequence
|
|
|
+points}. Here is where they occur:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+At the end of a @dfn{full expression}; that is to say, an expression
|
|
|
+that is not part of a larger expression. All side effects specified
|
|
|
+by that expression are carried out before execution moves
|
|
|
+on to subsequent code.
|
|
|
+
|
|
|
+@item
|
|
|
+At the end of the first operand of certain operators: @samp{,},
|
|
|
+@samp{&&}, @samp{||}, and @samp{?:}. All side effects specified by
|
|
|
+that expression are carried out before any execution of the
|
|
|
+next operand.
|
|
|
+
|
|
|
+The commas that separate arguments in a function call are @emph{not}
|
|
|
+comma operators, and they do not create sequence points. The rule
|
|
|
+for function arguments and the rule for operands are different
|
|
|
+(@pxref{Ordering of Operands}).
|
|
|
+
|
|
|
+@item
|
|
|
+Just before calling a function. All side effects specified by the
|
|
|
+argument expressions are carried out before calling the function.
|
|
|
+
|
|
|
+If the function to be called is not constant---that is, if it is
|
|
|
+computed by an expression---all side effects in that expression are
|
|
|
+carried out before calling the function.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+The ordering imposed by a sequence point applies locally to a limited
|
|
|
+range of code, as stated above in each case. For instance, the
|
|
|
+ordering imposed by the comma operator does not apply to code outside
|
|
|
+that comma operator. Thus, in this code,
|
|
|
+
|
|
|
+@example
|
|
|
+(x = 5, foo (x)) + x * x
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+the sequence point of the comma operator orders @code{x = 5} before
|
|
|
+@code{foo (x)}, but @code{x * x} could be computed before or after
|
|
|
+them.
|
|
|
+
|
|
|
+@node Postincrement and Ordering
|
|
|
+@section Postincrement and Ordering
|
|
|
+@cindex postincrement and ordering
|
|
|
+@cindex ordering and postincrement
|
|
|
+
|
|
|
+Ordering requirements are loose with the postincrement and
|
|
|
+postdecrement operations (@pxref{Postincrement/Postdecrement}), which
|
|
|
+specify side effects to happen ``a little later.'' They must happen
|
|
|
+before the next sequence point, but that still leaves room for various
|
|
|
+meanings. In this expression,
|
|
|
+
|
|
|
+@example
|
|
|
+z = x++ - foo ()
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+it's unpredictable whether @code{x} gets incremented before or after
|
|
|
+calling the function @code{foo}. If @code{foo} refers to @code{x},
|
|
|
+it might see the old value or it might see the incremented value.
|
|
|
+
|
|
|
+In this perverse expression,
|
|
|
+
|
|
|
+@example
|
|
|
+x = x++
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+@code{x} will certainly be incremented but the incremented value may
|
|
|
+not stick. If the incrementation of @code{x} happens after the
|
|
|
+assignment to @code{x}, the incremented value will remain in place.
|
|
|
+But if the incrementation happens first, the assignment will overwrite
|
|
|
+that with the not-yet-incremented value, so the expression as a whole
|
|
|
+will leave @code{x} unchanged.
|
|
|
+
|
|
|
+@node Ordering of Operands
|
|
|
+@section Ordering of Operands
|
|
|
+@cindex ordering of operands
|
|
|
+@cindex operand ordering
|
|
|
+
|
|
|
+Operands and arguments can be computed in any order, but there are limits to
|
|
|
+this intermixing in GNU C:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+The operands of a binary arithmetic operator can be computed in either
|
|
|
+order, but they can't be intermixed: one of them has to come first,
|
|
|
+followed by the other. Any side effects in the operand that's computed
|
|
|
+first are executed before the other operand is computed.
|
|
|
+
|
|
|
+@item
|
|
|
+That applies to assignment operators too, except that in simple assignment
|
|
|
+the previous value of the left operand is unused.
|
|
|
+
|
|
|
+@item
|
|
|
+The arguments in a function call can be computed in any order, but
|
|
|
+they can't be intermixed. Thus, one argument is fully computed, then
|
|
|
+another, and so on until they are all done. Any side effects in one argument
|
|
|
+are executed before computation of another argument begins.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+These rules don't cover side effects caused by postincrement and
|
|
|
+postdecrement operators---those can be deferred up to the next
|
|
|
+sequence point.
|
|
|
+
|
|
|
+If you want to get pedantic, the fact is that GCC can reorder the
|
|
|
+computations in many other ways provided that doesn't alter the result
|
|
|
+of running the program. However, because those reorderings don't alter the result
|
|
|
+of running the program, they are negligible, unless you are concerned
|
|
|
+with the values in certain variables at various times as seen by other
|
|
|
+processes. In those cases, you can use @code{volatile} to prevent
|
|
|
+optimizations that would make them behave strangely. @xref{volatile}.
|
|
|
+
|
|
|
+@node Optimization and Ordering
|
|
|
+@section Optimization and Ordering
|
|
|
+@cindex optimization and ordering
|
|
|
+@cindex ordering and optimization
|
|
|
+
|
|
|
+Sequence points limit the compiler's freedom to reorder operations
|
|
|
+arbitrarily, but optimizations can still reorder them if the compiler
|
|
|
+concludes that this won't alter the results. Thus, in this code,
|
|
|
+
|
|
|
+@example
|
|
|
+x++;
|
|
|
+y = z;
|
|
|
+x++;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+there is a sequence point after each statement, so the code is
|
|
|
+supposed to increment @code{x} once before the assignment to @code{y}
|
|
|
+and once after. However, incrementing @code{x} has no effect on
|
|
|
+@code{y} or @code{z}, and setting @code{y} can't affect @code{x}, so
|
|
|
+the code could be optimized into this:
|
|
|
+
|
|
|
+@example
|
|
|
+y = z;
|
|
|
+x += 2;
|
|
|
+@end example
|
|
|
+
|
|
|
+Normally that has no effect except to make the program faster. But
|
|
|
+there are special situations where it can cause trouble due to things
|
|
|
+that the compiler cannot know about, such as shared memory. To limit
|
|
|
+optimization in those places, use the @code{volatile} type qualifier
|
|
|
+(@pxref{volatile}).
|
|
|
+
|
|
|
+@node Primitive Types
|
|
|
+@chapter Primitive Data Types
|
|
|
+@cindex primitive types
|
|
|
+@cindex types, primitive
|
|
|
+
|
|
|
+This chapter describes all the primitive data types of C---that is,
|
|
|
+all the data types that aren't built up from other types. They
|
|
|
+include the types @code{int} and @code{double} that we've already covered.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Integer Types:: Description of integer types.
|
|
|
+* Floating-Point Data Types:: Description of floating-point types.
|
|
|
+* Complex Data Types:: Description of complex number types.
|
|
|
+* The Void Type:: A type indicating no value at all.
|
|
|
+* Other Data Types:: A brief summary of other types.
|
|
|
+* Type Designators:: Referring to a data type abstractly.
|
|
|
+@end menu
|
|
|
+
|
|
|
+These types are all made up of bytes (@pxref{Storage}).
|
|
|
+
|
|
|
+@node Integer Types
|
|
|
+@section Integer Data Types
|
|
|
+@cindex integer types
|
|
|
+@cindex types, integer
|
|
|
+
|
|
|
+Here we describe all the integer types and their basic
|
|
|
+characteristics. @xref{Integers in Depth}, for more information about
|
|
|
+the bit-level integer data representations and arithmetic.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Basic Integers:: Overview of the various kinds of integers.
|
|
|
+* Signed and Unsigned Types:: Integers can either hold both negative and
|
|
|
+ non-negative values, or only non-negative.
|
|
|
+* Narrow Integers:: When to use smaller integer types.
|
|
|
+* Integer Conversion:: Casting a value from one integer type
|
|
|
+ to another.
|
|
|
+* Boolean Type:: An integer type for boolean values.
|
|
|
+* Integer Variations:: Sizes of integer types can vary
|
|
|
+ across platforms.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Basic Integers
|
|
|
+@subsection Basic Integers
|
|
|
+
|
|
|
+@findex char
|
|
|
+@findex int
|
|
|
+@findex short int
|
|
|
+@findex long int
|
|
|
+@findex long long int
|
|
|
+
|
|
|
+Integer data types in C can be signed or unsigned. An unsigned type
|
|
|
+can represent only positive numbers and zero. A signed type can
|
|
|
+represent both positive and negative numbers, in a range spread almost
|
|
|
+equally on both sides of zero.
|
|
|
+
|
|
|
+Aside from signedness, the integer data types vary in size: how many
|
|
|
+bytes long they are. The size determines how many different integer
|
|
|
+values the type can hold.
|
|
|
+
|
|
|
+Here's a list of the signed integer data types, with the sizes they
|
|
|
+have on most computers. Each has a corresponding unsigned type; see
|
|
|
+@ref{Signed and Unsigned Types}.
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item signed char
|
|
|
+One byte (8 bits). This integer type is used mainly for integers that
|
|
|
+represent characters, as part of arrays or other data structures.
|
|
|
+
|
|
|
+@item short
|
|
|
+@itemx short int
|
|
|
+Two bytes (16 bits).
|
|
|
+
|
|
|
+@item int
|
|
|
+Four bytes (32 bits).
|
|
|
+
|
|
|
+@item long
|
|
|
+@itemx long int
|
|
|
+Four bytes (32 bits) or eight bytes (64 bits), depending on the
|
|
|
+platform. Typically it is 32 bits on 32-bit computers
|
|
|
+and 64 bits on 64-bit computers, but there are exceptions.
|
|
|
+
|
|
|
+@item long long
|
|
|
+@itemx long long int
|
|
|
+Eight bytes (64 bits). Supported in GNU C in the 1980s, and
|
|
|
+incorporated into standard C as of ISO C99.
|
|
|
+@end table
|
|
|
+
|
|
|
+You can omit @code{int} when you use @code{long} or @code{short}.
|
|
|
+This is harmless and customary.
|
|
|
+
|
|
|
+@node Signed and Unsigned Types
|
|
|
+@subsection Signed and Unsigned Types
|
|
|
+@cindex signed types
|
|
|
+@cindex unsigned types
|
|
|
+@cindex types, signed
|
|
|
+@cindex types, unsigned
|
|
|
+@findex signed
|
|
|
+@findex unsigned
|
|
|
+
|
|
|
+An unsigned integer type can represent only positive numbers and zero.
|
|
|
+A signed type can represent both positive and negative numbers, in a
|
|
|
+range spread almost equally on both sides of zero. For instance,
|
|
|
+@code{unsigned char} holds numbers from 0 to 255 (on most computers),
|
|
|
+while @code{signed char} holds numbers from @minus{}128 to 127. Each of
|
|
|
+these types holds 256 different possible values, since they are both 8
|
|
|
+bits wide.
|
|
|
+
|
|
|
+Write @code{signed} or @code{unsigned} before the type keyword to
|
|
|
+specify a signed or an unsigned type. However, the integer types
|
|
|
+other than @code{char} are signed by default; with them, @code{signed}
|
|
|
+is a no-op.
|
|
|
+
|
|
|
+Plain @code{char} may be signed or unsigned; this depends on the
|
|
|
+compiler, the machine in use, and its operating system.
|
|
|
+
|
|
|
+In many programs, it makes no difference whether @code{char} is
|
|
|
+signed. When it does matter, don't leave it to chance; write
|
|
|
+@code{signed char} or @code{unsigned char}.@footnote{Personal note from
|
|
|
+Richard Stallman: Eating with hackers at a fish restaurant, I ordered
|
|
|
+Arctic Char. When my meal arrived, I noted that the chef had not
|
|
|
+signed it. So I complained, ``This char is unsigned---I wanted a
|
|
|
+signed char!'' Or rather, I would have said this if I had thought of
|
|
|
+it fast enough.}
|
|
|
+
|
|
|
+@node Narrow Integers
|
|
|
+@subsection Narrow Integers
|
|
|
+
|
|
|
+The types that are narrower than @code{int} are rarely used for
|
|
|
+ordinary variables---we declare them @code{int} instead. This is
|
|
|
+because C converts those narrower types to @code{int} for any
|
|
|
+arithmetic. There is literally no reason to declare a local variable
|
|
|
+@code{char}, for instance.
|
|
|
+
|
|
|
+In particular, if the value is really a character, you should declare
|
|
|
+the variable @code{int}. Not @code{char}! Using that narrow type can
|
|
|
+force the compiler to truncate values for conversion, which is a
|
|
|
+waste. Furthermore, some functions return either a character value,
|
|
|
+or @minus{}1 for ``no character.'' Using @code{int} keeps those
|
|
|
+values distinct.
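The standard function @code{getc} is one such function: it returns a character value or @code{EOF}. A sketch of the usual idiom follows; the function name @code{count_bytes} is hypothetical.

```c
#include <stdio.h>

/* Count the bytes remaining in STREAM.  */
long
count_bytes (FILE *stream)
{
  long n = 0;
  int c;   /* int, not char, so EOF differs from every character.  */

  while ((c = getc (stream)) != EOF)
    n++;
  return n;
}
```

If @code{c} were declared @code{char}, a byte with all bits set could be mistaken for @code{EOF} on platforms where @code{char} is signed, and the loop might never end where it is unsigned.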
|
|
|
+
|
|
|
+The narrow integer types are useful as parts of other objects, such as
|
|
|
+arrays and structures. Compare these array declarations, whose sizes
|
|
|
+on 32-bit processors are shown:
|
|
|
+
|
|
|
+@example
|
|
|
+signed char ac[1000]; /* @r{1000 bytes} */
|
|
|
+short as[1000]; /* @r{2000 bytes} */
|
|
|
+int ai[1000]; /* @r{4000 bytes} */
|
|
|
+long long all[1000]; /* @r{8000 bytes} */
|
|
|
+@end example
|
|
|
+
|
|
|
+In addition, character strings must be made up of @code{char}s,
|
|
|
+because that's what all the standard library string functions expect.
|
|
|
+Thus, array @code{ac} could be used as a character string, but the
|
|
|
+others could not be.
|
|
|
+
|
|
|
+@node Integer Conversion
|
|
|
+@subsection Conversion among Integer Types
|
|
|
+
|
|
|
+C converts between integer types implicitly in many situations. It
|
|
|
+converts the narrow integer types, @code{char} and @code{short}, to
|
|
|
+@code{int} whenever they are used in arithmetic. Assigning a new
|
|
|
+value to an integer variable (or other lvalue) converts the value to
|
|
|
+the variable's type.
|
|
|
+
|
|
|
+You can also convert one integer type to another explicitly with a
|
|
|
+@dfn{cast} operator. @xref{Explicit Type Conversion}.
|
|
|
+
|
|
|
+The process of conversion to a wider type is straightforward: the
|
|
|
+value is unchanged. The only exception is when converting a negative
|
|
|
+value (in a signed type, obviously) to a wider unsigned type. In that
|
|
|
+case, the result is a large positive value: the original value plus 2 to the
+power of the new type's width in bits, as if sign-extended and reinterpreted
|
|
|
+(@pxref{Integers in Depth}).
|
|
|
+
|
|
|
+@cindex truncation
|
|
|
+Converting to a narrower type, also called @dfn{truncation}, involves
|
|
|
+discarding some of the value's bits. This is not considered overflow
|
|
|
+(@pxref{Integer Overflow}) because loss of significant bits is a
|
|
|
+normal consequence of truncation. Likewise for conversion between
|
|
|
+signed and unsigned types of the same width.
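These conversions can be sketched with the exact-width types from @file{stdint.h}. The three function names are hypothetical, and the values in the comments follow from the rule that conversion to an unsigned type reduces the value modulo 2 to the power of the type's width.

```c
#include <stdint.h>

/* Widening a negative value to unsigned: -1 becomes 4294967295.  */
uint32_t
widen_to_unsigned (int8_t v)
{
  return (uint32_t) v;
}

/* Truncation keeps only the low 8 bits: 0x1234 becomes 0x34.  */
uint8_t
narrow (uint32_t v)
{
  return (uint8_t) v;
}

/* Same width, signed to unsigned: -2 becomes 4294967294.  */
uint32_t
same_width (int32_t v)
{
  return (uint32_t) v;
}
```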
|
|
|
+
|
|
|
+More information about conversion for assignment is in
|
|
|
+@ref{Assignment Type Conversions}. For conversion for arithmetic,
|
|
|
+see @ref{Argument Promotions}.
|
|
|
+
|
|
|
+@node Boolean Type
|
|
|
+@subsection Boolean Type
|
|
|
+@cindex boolean type
|
|
|
+@cindex type, boolean
|
|
|
+@findex bool
|
|
|
+
|
|
|
+The unsigned integer type @code{bool} holds truth values: its possible
|
|
|
+values are 0 and 1. Converting any nonzero value to @code{bool}
|
|
|
+results in 1. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+bool a = 0;
|
|
|
+bool b = 1;
|
|
|
+bool c = 4; /* @r{Stores the value 1 in @code{c}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+Unlike @code{int}, @code{bool} is not a keyword. It is defined in
|
|
|
+the header file @file{stdbool.h}.
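The sketch below contrasts conversion to @code{bool} with truncation to a narrow integer type. The function names are hypothetical, and the @code{unsigned char} result assumes 8-bit bytes.

```c
#include <stdbool.h>

/* Any nonzero value converts to 1.  */
bool
as_bool (int v)
{
  return v;
}

/* Truncation keeps only the low bits, so 256 becomes 0
   when unsigned char is 8 bits wide.  */
unsigned char
as_uchar (int v)
{
  return (unsigned char) v;
}
```

Thus 256 is true as a @code{bool} even though its low 8 bits are all zero.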
|
|
|
+
|
|
|
+@node Integer Variations
|
|
|
+@subsection Integer Variations
|
|
|
+
|
|
|
+The integer types of C have standard @emph{names}, but what they
|
|
|
+@emph{mean} varies depending on the kind of platform in use:
|
|
|
+which kind of computer, which operating system, and which compiler.
|
|
|
+It may even depend on the compiler options used.
|
|
|
+
|
|
|
+Plain @code{char} may be signed or unsigned; this depends on the
|
|
|
+platform, too. Even for GNU C, there is no general rule.
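+
+A program can find out which convention is in effect from the macro
+@code{CHAR_MIN}, defined in @file{limits.h}. Here is one possible
+way to do it; the function name is made up for illustration.
+
```c
#include <limits.h>

/* Return 1 if plain char is signed on this platform, 0 if it is
   unsigned.  CHAR_MIN is 0 exactly when char is unsigned.  */
int
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}
```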
|
|
|
+
|
|
|
+In theory, all of the integer types' sizes can vary. @code{char} is
|
|
|
+always considered one ``byte'' for C, but it is not necessarily an
|
|
|
+8-bit byte; on some platforms it may be more than 8 bits. ISO C
|
|
|
+specifies only that none of these types is narrower than the ones
|
|
|
+above it in the list in @ref{Basic Integers}, and that @code{short}
|
|
|
+has at least 16 bits.
|
|
|
+
|
|
|
+It is possible that in the future GNU C will support platforms where
|
|
|
+@code{int} is 64 bits long. In practice, however, on today's real
|
|
|
+computers, there is little variation; you can rely on the table
|
|
|
+given previously (@pxref{Basic Integers}).
|
|
|
+
|
|
|
+To be completely sure of the size of an integer type,
|
|
|
+use the types @code{int16_t}, @code{int32_t} and @code{int64_t}.
|
|
|
+Their corresponding unsigned types add @samp{u} at the front.
|
|
|
+To define these, include the header file @file{stdint.h}.
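+
+The sketch below computes the widths of these types instead of
+assuming them. @code{CHAR_BIT}, from @file{limits.h}, is the number
+of bits in a @code{char}; the function names are invented here.
+
```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of each exact-width type.  */
int
bits_in_int16 (void)
{
  return (int) (sizeof (int16_t) * CHAR_BIT);
}

int
bits_in_uint32 (void)
{
  return (int) (sizeof (uint32_t) * CHAR_BIT);
}

int
bits_in_int64 (void)
{
  return (int) (sizeof (int64_t) * CHAR_BIT);
}
```
+
+@noindent
+On any platform that provides these types, the three functions
+return 16, 32 and 64.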
|
|
|
+
|
|
|
+The GNU C Compiler compiles for some embedded controllers that use two
|
|
|
+bytes for @code{int}. On some, @code{int} is just one ``byte,'' and
|
|
|
+so is @code{short int}---but that ``byte'' may contain 16 bits or even
|
|
|
+32 bits. These processors can't support an ordinary operating system
|
|
|
+(they may have their own specialized operating systems), and most C
|
|
|
+programs do not try to support them.
|
|
|
+
|
|
|
+@node Floating-Point Data Types
|
|
|
+@section Floating-Point Data Types
|
|
|
+@cindex floating-point types
|
|
|
+@cindex types, floating-point
|
|
|
+@findex double
|
|
|
+@findex float
|
|
|
+@findex long double
|
|
|
+
|
|
|
+@dfn{Floating point} is the binary analogue of scientific notation:
|
|
|
+internally it represents a number as a fraction and a binary exponent; the
|
|
|
+value is that fraction multiplied by the specified power of 2.
|
|
|
+
|
|
|
+For instance, to represent 6, the fraction would be 0.75 and the
|
|
|
+exponent would be 3; together they stand for the value @math{0.75 * 2@sup{3}},
|
|
|
+meaning 0.75 * 8. The value 1.5 would use 0.75 as the fraction and 1
|
|
|
+as the exponent. The value 0.75 would use 0.75 as the fraction and 0
|
|
|
+as the exponent. The value 0.375 would use 0.75 as the fraction and
|
|
|
+-1 as the exponent.
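+
+To make the decomposition concrete, here is a sketch of a function
+that splits a value into such a fraction and exponent. The name
+@code{split_binary} is hypothetical, and the code handles only
+positive, finite values; the standard library function @code{frexp},
+declared in @file{math.h}, does the same job in general.
+
```c
/* Split a positive, finite X into a fraction in [0.5, 1) and a
   binary exponent, so that X equals fraction * 2^(*exponent).  */
double
split_binary (double x, int *exponent)
{
  int e = 0;

  while (x >= 1.0)
    {
      x /= 2.0;    /* Halving only lowers the binary exponent.  */
      e++;
    }
  while (x < 0.5)
    {
      x *= 2.0;    /* Doubling only raises it.  */
      e--;
    }
  *exponent = e;
  return x;
}
```
+
+@noindent
+For 6, 1.5, 0.75 and 0.375, this returns the fraction 0.75 each
+time, with exponents 3, 1, 0 and @minus{}1 respectively.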
|
|
|
+
|
|
|
+These binary exponents are used by machine instructions. You can
|
|
|
+write a floating-point constant this way if you wish, using
|
|
|
+hexadecimal; but normally we write floating-point numbers in decimal.
|
|
|
+@xref{Floating Constants}.
|
|
|
+
|
|
|
+C has three floating-point data types:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item double
|
|
|
+``Double-precision'' floating point, which uses 64 bits. This is the
|
|
|
+normal floating-point type, and modern computers normally do
|
|
|
+their floating-point computations in this type, or some wider type.
|
|
|
+Except when there is a special reason to do otherwise, this is the
|
|
|
+type to use for floating-point values.
|
|
|
+
|
|
|
+@item float
|
|
|
+``Single-precision'' floating point, which uses 32 bits. It is useful
|
|
|
+for floating-point values stored in structures and arrays, to save
|
|
|
+space when the full precision of @code{double} is not needed. In
|
|
|
+addition, single-precision arithmetic is faster on some computers, and
|
|
|
+occasionally that is useful. But not often---most programs don't use
|
|
|
+the type @code{float}.
|
|
|
+
|
|
|
+C would be cleaner if @code{float} were the name of the type we
|
|
|
+use for most floating-point values; however, for historical reasons,
|
|
|
+that's not so.
|
|
|
+
|
|
|
+@item long double
|
|
|
+``Extended-precision'' floating point is either 80-bit or 128-bit
|
|
|
+precision, depending on the machine in use. On some machines, which
|
|
|
+have no floating-point format wider than @code{double}, this is
|
|
|
+equivalent to @code{double}.
|
|
|
+@end table
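+
+The difference in precision can be seen with a sketch like this one.
+It assumes that @code{float} is the IEEE 754 single-precision format
+with 24 significant bits, as on most current platforms; the function
+name is invented for illustration.
+
```c
/* Round X to float precision, then widen it back to double.  */
double
through_float (double x)
{
  return (float) x;
}
```
+
+@noindent
+Under that assumption, @code{through_float (0.5)} is still 0.5, but
+@code{through_float (16777217.0)} is 16777216.0, because 16777217
+needs 25 significant bits.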
|
|
|
+
|
|
|
+Floating-point arithmetic raises many subtle issues. @xref{Floating
|
|
|
+Point in Depth}, for more information.
|
|
|
+
|
|
|
+@node Complex Data Types
|
|
|
+@section Complex Data Types
|
|
|
+@cindex complex numbers
|
|
|
+@cindex types, complex
|
|
|
+@cindex @code{_Complex} keyword
|
|
|
+@cindex @code{__complex__} keyword
|
|
|
+@findex _Complex
|
|
|
+@findex __complex__
|
|
|
+
|
|
|
+Complex numbers can include both a real part and an imaginary part.
|
|
|
+The numeric constants covered above have real-numbered values. An
|
|
|
+imaginary-valued constant is an ordinary real-valued constant followed
|
|
|
+by @samp{i}.
|
|
|
+
|
|
|
+To declare numeric variables as complex, use the @code{_Complex}
|
|
|
+keyword.@footnote{For compatibility with older versions of GNU C, the
|
|
|
+keyword @code{__complex__} is also allowed. Going forward, however,
|
|
|
+use the new @code{_Complex} keyword as defined in ISO C11.} The
|
|
|
+standard C complex data types are floating point,
|
|
|
+
|
|
|
+@example
|
|
|
+_Complex float foo;
|
|
|
+_Complex double bar;
|
|
|
+_Complex long double quux;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+but GNU C supports integer complex types as well.
|
|
|
+
|
|
|
+Since @code{_Complex} is a keyword just like @code{float} and
|
|
|
+@code{double} and @code{long}, the keywords can appear in any order,
|
|
|
+but the order shown above seems most logical.
|
|
|
+
|
|
|
+GNU C supports constants for complex values; for instance, @code{4.0 +
|
|
|
+3.0i} has the value 4 + 3i as type @code{_Complex double}.
|
|
|
+@xref{Imaginary Constants}.
|
|
|
+
|
|
|
+To pull the real and imaginary parts of the number back out, GNU C
|
|
|
+provides the keywords @code{__real__} and @code{__imag__}:
|
|
|
+
|
|
|
+@example
|
|
|
+_Complex double foo = 4.0 + 3.0i;
|
|
|
+
|
|
|
+double a = __real__ foo; /* @r{@code{a} is now 4.0.} */
|
|
|
+double b = __imag__ foo; /* @r{@code{b} is now 3.0.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Standard C does not include these keywords, and instead relies on
|
|
|
+functions declared in @file{complex.h} for accessing the real and
|
|
|
+imaginary parts of a complex number: @code{crealf}, @code{creal}, and
|
|
|
+@code{creall} extract the real part of a float, double, or long double
|
|
|
+complex number, respectively; @code{cimagf}, @code{cimag}, and
|
|
|
+@code{cimagl} extract the imaginary part.
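+
+Here is a sketch that uses only the standard C interface. The
+function name is hypothetical; @code{complex} is a macro from
+@file{complex.h} that stands for @code{_Complex}.
+
```c
#include <complex.h>

/* Squared magnitude of Z, computed from its real and imaginary
   parts as extracted by the standard library functions.  */
double
magnitude_squared (double complex z)
{
  double re = creal (z);
  double im = cimag (z);
  return re * re + im * im;
}
```
+
+@noindent
+For the value 4 + 3i, the result is 25.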
|
|
|
+
|
|
|
+@cindex complex conjugation
|
|
|
+GNU C also defines @samp{~} as an operator for complex conjugation,
|
|
|
+which means negating the imaginary part of a complex number:
|
|
|
+
|
|
|
+@example
|
|
|
+_Complex double foo = 4.0 + 3.0i;
|
|
|
+_Complex double bar = ~foo; /* @r{@code{bar} is now 4 @minus{} 3i.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+For standard C compatibility, you can use the appropriate library
|
|
|
+function: @code{conjf}, @code{conj}, or @code{conjl}.
|
|
|
+
|
|
|
+@node The Void Type
|
|
|
+@section The Void Type
|
|
|
+@cindex void type
|
|
|
+@cindex type, void
|
|
|
+@findex void
|
|
|
+
|
|
|
+The data type @code{void} is a dummy---it allows no operations. It
|
|
|
+really means ``no value at all.'' When a function is meant to return
|
|
|
+no value, we write @code{void} for its return type. Then
|
|
|
+@code{return} statements in that function should not specify a value
|
|
|
+(@pxref{return Statement}). Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+print_if_positive (double x, double y)
|
|
|
+@{
|
|
|
+ if (x <= 0)
|
|
|
+ return;
|
|
|
+ if (y <= 0)
|
|
|
+ return;
|
|
|
+ printf ("Next point is (%f,%f)\n", x, y);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+A @code{void}-returning function is comparable to what some other languages
|
|
|
+call a ``procedure'' instead of a ``function.''
|
|
|
+
|
|
|
+@c ??? Already presented
|
|
|
+@c @samp{%f} in an output template specifies to format a @code{double} value
|
|
|
+@c as a decimal number, using a decimal point if needed.
|
|
|
+
|
|
|
+@node Other Data Types
|
|
|
+@section Other Data Types
|
|
|
+
|
|
|
+Beyond the primitive types, C provides several ways to construct new
|
|
|
+data types. For instance, you can define @dfn{pointers}, values that
|
|
|
+represent the addresses of other data (@pxref{Pointers}). You can
|
|
|
+define @dfn{structures}, as in many other languages
|
|
|
+(@pxref{Structures}), and @dfn{unions}, which specify multiple ways
|
|
|
+to look at the same memory space (@pxref{Unions}). @dfn{Enumerations}
|
|
|
+are collections of named integer codes (@pxref{Enumeration Types}).
|
|
|
+
|
|
|
+@dfn{Array types} in C are used for allocating space for objects,
|
|
|
+but C does not permit operating on an array value as a whole. @xref{Arrays}.
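+
+To illustrate the difference, the following sketch copies a
+structure as a whole but copies an array through its elements. The
+type @code{struct point} and both function names are invented for
+this example.
+
```c
#include <string.h>

struct point
{
  int x, y;
};

/* A structure value can be copied as a whole by assignment.  */
struct point
copy_point (struct point p)
{
  struct point q = p;
  return q;
}

/* An array value cannot; its elements must be copied instead,
   here with memcpy.  */
void
copy_ints (int *dst, const int *src, size_t n)
{
  memcpy (dst, src, n * sizeof *src);
}
```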
|
|
|
+
|
|
|
+@node Type Designators
|
|
|
+@section Type Designators
|
|
|
+@cindex type designator
|
|
|
+
|
|
|
+Some C constructs require a way to designate a specific data type
|
|
|
+independent of any particular variable or expression which has that
|
|
|
+type. The way to do this is with a @dfn{type designator}. The
|
|
|
+constructs that need one include casts (@pxref{Explicit Type
|
|
|
+Conversion}) and @code{sizeof} (@pxref{Type Size}).
|
|
|
+
|
|
|
+We also use type designators to talk about the type of a value in C,
|
|
|
+so you will see many type designators in this manual. When we say,
|
|
|
+``The value has type @code{int},'' @code{int} is a type designator.
|
|
|
+
|
|
|
+To make the designator for any type, imagine a variable declaration
|
|
|
+for a variable of that type and delete the variable name and the final
|
|
|
+semicolon.
|
|
|
+
|
|
|
+For example, to designate the type of full-word integers, we start
|
|
|
+with the declaration for a variable @code{foo} with that type,
|
|
|
+which is this:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Then we delete the variable name @code{foo} and the semicolon, leaving
|
|
|
+@code{int}---exactly the keyword used in such a declaration.
|
|
|
+Therefore, the type designator for this type is @code{int}.
|
|
|
+
|
|
|
+What about long unsigned integers? From the declaration
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned long int foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+we determine that the designator is @code{unsigned long int}.
|
|
|
+
|
|
|
+Following this procedure, the designator for any primitive type is
|
|
|
+simply the set of keywords which specifies that type in a declaration.
|
|
|
+The same is true for compound types such as structures, unions, and
|
|
|
+enumerations.
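+
+As an illustration, this sketch uses a few type designators in the
+two constructs mentioned above, @code{sizeof} and casts. The
+structure type and the function names are hypothetical.
+
```c
#include <stddef.h>

struct point
{
  int x, y;
};

/* Type designators as operands of sizeof.  */
size_t
size_of_ulong (void)
{
  return sizeof (unsigned long int);
}

size_t
size_of_point (void)
{
  return sizeof (struct point);
}

/* A type designator in a cast: convert N to double before dividing,
   so that the division is done in floating point.  */
double
half (int n)
{
  return (double) n / 2;
}
```
+
+@noindent
+Thus @code{half (3)} is 1.5, whereas the integer division
+@code{3 / 2} would give 1.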
|
|
|
+
|
|
|
+Designators for pointer types do follow the rule of deleting the
|
|
|
+variable name and semicolon, but the result is not so simple.
|
|
|
+@xref{Pointer Type Designators}, as part of the chapter about
|
|
|
+pointers. @xref{Array Type Designators}, for designators for array
|
|
|
+types.
|
|
|
+
|
|
|
+To understand what type a designator stands for, imagine a variable
|
|
|
+name inserted into the right place in the designator to make a valid
|
|
|
+declaration. What type would that variable be declared as? That is the
|
|
|
+type the designator designates.
|
|
|
+
|
|
|
+@node Constants
|
|
|
+@chapter Constants
|
|
|
+@cindex constants
|
|
|
+
|
|
|
+A @dfn{constant} is an expression that stands for a specific value by
|
|
|
+explicitly representing the desired value. C allows constants for
|
|
|
+numbers, characters, and strings. We have already seen numeric and
|
|
|
+string constants in the examples.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Integer Constants:: Literal integer values.
|
|
|
+* Integer Const Type:: Types of literal integer values.
|
|
|
+* Floating Constants:: Literal floating-point values.
|
|
|
+* Imaginary Constants:: Literal imaginary number values.
|
|
|
+* Invalid Numbers:: Avoiding preprocessing number misconceptions.
|
|
|
+* Character Constants:: Literal character values.
|
|
|
+* String Constants:: Literal string values.
|
|
|
+* UTF-8 String Constants:: Literal UTF-8 string values.
|
|
|
+* Unicode Character Codes:: Unicode characters represented
|
|
|
+ in either UTF-16 or UTF-32.
|
|
|
+* Wide Character Constants:: Literal character values larger than 8 bits.
|
|
|
+* Wide String Constants:: Literal string values made up of
|
|
|
+ 16- or 32-bit characters.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Integer Constants
|
|
|
+@section Integer Constants
|
|
|
+@cindex integer constants
|
|
|
+@cindex constants, integer
|
|
|
+
|
|
|
+An integer constant consists of a number to specify the value,
|
|
|
+followed optionally by suffix letters to specify the data type.
|
|
|
+
|
|
|
+The simplest integer constants are numbers written in base 10
|
|
|
+(decimal), such as @code{5}, @code{77}, and @code{403}. A decimal
|
|
|
+constant cannot start with the character @samp{0} (zero) because
|
|
|
+that makes the constant octal.
|
|
|
+
|
|
|
+You can get the effect of a negative integer constant by putting a
|
|
|
+minus sign at the beginning. Grammatically speaking, that is an
|
|
|
+arithmetic expression rather than a constant, but it behaves just like
|
|
|
+a true constant.
|
|
|
+
|
|
|
+Integer constants can also be written in octal (base 8), hexadecimal
|
|
|
+(base 16), or binary (base 2). An octal constant starts with the
|
|
|
+character @samp{0} (zero), followed by any number of octal digits
|
|
|
+(@samp{0} to @samp{7}):
|
|
|
+
|
|
|
+@example
|
|
|
+0 // @r{zero}
|
|
|
+077 // @r{63}
|
|
|
+0403 // @r{259}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Pedantically speaking, the constant @code{0} is an octal constant, but
|
|
|
+we can think of it as decimal; it has the same value either way.
|
|
|
+
|
|
|
+A hexadecimal constant starts with @samp{0x} (upper or lower case)
|
|
|
+followed by hex digits (@samp{0} to @samp{9}, as well as @samp{a}
|
|
|
+through @samp{f} in upper or lower case):
|
|
|
+
|
|
|
+@example
|
|
|
+0xff // @r{255}
|
|
|
+0XA0 // @r{160}
|
|
|
+0xffFF // @r{65535}
|
|
|
+@end example
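+
+The base affects only how a constant is written, not its value or
+type. The sketch below (with invented function names) compares
+several spellings, and also shows the usual pitfall of a leading
+zero.
+
```c
/* Each comparison is between two spellings of the same value.  */
int
bases_agree (void)
{
  return 63 == 077 && 63 == 0x3f && 255 == 0xFF && 0377 == 0xff;
}

/* A leading zero changes the meaning: 010 is octal, so this
   function returns 8, not 10.  */
int
leading_zero_pitfall (void)
{
  return 010;
}
```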
|
|
|
+
|
|
|
+@cindex binary integer constants
|
|
|
+A binary constant starts with @samp{0b} (upper or lower case) followed
|
|
|
+by bits (each represented by the characters @samp{0} or @samp{1}):
|
|
|
+
|
|
|
+@example
|
|
|
+0b101 // @r{5}
|
|
|
+@end example
|
|
|
+
|
|
|
+Binary constants are a GNU C extension, not part of the C standard.
|
|
|
+
|
|
|
+Sometimes a space is needed after an integer constant to avoid
|
|
|
+lexical confusion with the following tokens. @xref{Invalid Numbers}.
|
|
|
+
|
|
|
+@node Integer Const Type
|
|
|
+@section Integer Constant Data Types
|
|
|
+@cindex integer constant data types
|
|
|
+@cindex constant data types, integer
|
|
|
+@cindex types of integer constants
|
|
|
+
|
|
|
+The type of an integer constant is normally @code{int}, if the value
|
|
|
+fits in that type, but here are the complete rules. The type
|
|
|
+of an integer constant is the first one in this sequence that can
|
|
|
+properly represent the value,
|
|
|
+
|
|
|
+@enumerate
|
|
|
+@item
|
|
|
+@code{int}
|
|
|
+@item
|
|
|
+@code{unsigned int}
|
|
|
+@item
|
|
|
+@code{long int}
|
|
|
+@item
|
|
|
+@code{unsigned long int}
|
|
|
+@item
|
|
|
+@code{long long int}
|
|
|
+@item
|
|
|
+@code{unsigned long long int}
|
|
|
+@end enumerate
|
|
|
+
|
|
|
+@noindent
|
|
|
+and that isn't excluded by the following rules.
|
|
|
+
|
|
|
+If the constant has @samp{l} or @samp{L} as a suffix, that excludes the
|
|
|
+first two types (non-@code{long}).
|
|
|
+
|
|
|
+If the constant has @samp{ll} or @samp{LL} as a suffix, that excludes
|
|
|
+the first four types (non-@code{long long}).
|
|
|
+
|
|
|
+If the constant has @samp{u} or @samp{U} as a suffix, that excludes
|
|
|
+the signed types.
|
|
|
+
|
|
|
+Otherwise, if the constant is decimal, that excludes the unsigned
|
|
|
+types.
|
|
|
+@c ### This said @code{unsigned int} is excluded.
|
|
|
+@c ### See 17 April 2016
|
|
|
+
|
|
|
+Here are some examples of the suffixes.
|
|
|
+
|
|
|
+@example
|
|
|
+3000000000u // @r{three billion as @code{unsigned int}.}
|
|
|
+0LL // @r{zero as a @code{long long int}.}
|
|
|
+0403l // @r{259 as a @code{long int}.}
|
|
|
+@end example
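+
+One way to check which type a constant has is the @code{_Generic}
+construct of ISO C11, which selects a value according to the type of
+an expression. The following is only a sketch: the enumeration, the
+macro @code{INT_KIND} and the function are all invented for this
+illustration.
+
```c
enum int_kind
{
  KIND_INT, KIND_UINT, KIND_LONG, KIND_ULONG, KIND_LLONG, KIND_ULLONG
};

/* Classify the type of the expression X.  */
#define INT_KIND(x) _Generic ((x),              \
    int: KIND_INT,                              \
    unsigned int: KIND_UINT,                    \
    long int: KIND_LONG,                        \
    unsigned long int: KIND_ULONG,              \
    long long int: KIND_LLONG,                  \
    unsigned long long int: KIND_ULLONG)

/* Return 1 if small constants get the types their suffixes
   call for.  */
int
suffixes_select_types (void)
{
  return (INT_KIND (5) == KIND_INT
          && INT_KIND (5u) == KIND_UINT
          && INT_KIND (0403l) == KIND_LONG
          && INT_KIND (5ul) == KIND_ULONG
          && INT_KIND (0LL) == KIND_LLONG
          && INT_KIND (5ull) == KIND_ULLONG);
}
```
+
+@noindent
+The values used here are small enough to fit in every type, so the
+suffix alone decides the result.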
|
|
|
+
|
|
|
+Suffixes in integer constants are rarely used. When the precise type
|
|
|
+is important, it is cleaner to convert explicitly (@pxref{Explicit
|
|
|
+Type Conversion}).
|
|
|
+
|
|
|
+@xref{Integer Types}.
|
|
|
+
|
|
|
+@node Floating Constants
|
|
|
+@section Floating-Point Constants
|
|
|
+@cindex floating-point constants
|
|
|
+@cindex constants, floating-point
|
|
|
+
|
|
|
+A floating-point constant must have either a decimal point, an
|
|
|
+exponent-of-ten, or both; they distinguish it from an integer
|
|
|
+constant.
|
|
|
+
|
|
|
+To indicate an exponent, write @samp{e} or @samp{E}. The exponent
|
|
|
+value follows. It is always written as a decimal number; it can
|
|
|
+optionally start with a sign. The exponent @var{n} means to multiply
|
|
|
+the constant's value by ten to the @var{n}th power.
|
|
|
+
|
|
|
+Thus, @samp{1500.0}, @samp{15e2}, @samp{15e+2}, @samp{15.0e2},
|
|
|
+@samp{1.5e+3}, @samp{.15e4}, and @samp{15000e-1} are seven ways of
|
|
|
+writing a floating-point number whose value is 1500. They are all
|
|
|
+equivalent.
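+
+A quick way to convince yourself is to compare the constants with
+one another, as in this sketch (the function name is made up):
+
```c
/* Return 1 if every spelling denotes the same double value.  */
int
all_equal_1500 (void)
{
  return (1500.0 == 15e2
          && 15e2 == 15e+2
          && 15e+2 == 15.0e2
          && 15.0e2 == 1.5e+3
          && 1.5e+3 == .15e4
          && .15e4 == 15000e-1);
}
```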
|
|
|
+
|
|
|
+Here are more examples with decimal points:
|
|
|
+
|
|
|
+@example
|
|
|
+1.0
|
|
|
+1000.
|
|
|
+3.14159
|
|
|
+.05
|
|
|
+.0005
|
|
|
+@end example
|
|
|
+
|
|
|
+For each of them, here are some equivalent constants written with
|
|
|
+exponents:
|
|
|
+
|
|
|
+@example
|
|
|
+1e0, 1.0000e0
|
|
|
+100e1, 100e+1, 100E+1, 1e3, 10000e-1
|
|
|
+3.14159e0
|
|
|
+5e-2, .0005e+2, 5E-2, .0005E2
|
|
|
+.05e-2
|
|
|
+@end example
|
|
|
+
|
|
|
+A floating-point constant normally has type @code{double}. You can
|
|
|
+force it to type @code{float} by adding @samp{f} or @samp{F}
|
|
|
+at the end. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+3.14159f
|
|
|
+3.14159e0f
|
|
|
+1000.f
|
|
|
+100E1F
|
|
|
+.0005f
|
|
|
+.05e-2f
|
|
|
+@end example
|
|
|
+
|
|
|
+Likewise, @samp{l} or @samp{L} at the end forces the constant
|
|
|
+to type @code{long double}.
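+
+Since @code{sizeof} looks only at the type of its operand, it can
+show the effect of the suffixes. This is just a sketch; the function
+name is hypothetical.
+
```c
/* Return 1 if each constant has the type its suffix calls for,
   judging by size.  The operands of sizeof are not evaluated.  */
int
suffix_selects_type (void)
{
  return (sizeof 3.14159 == sizeof (double)
          && sizeof 3.14159f == sizeof (float)
          && sizeof 3.14159L == sizeof (long double));
}
```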
|
|
|
+
|
|
|
+You can use exponents in hexadecimal floating constants, but since
|
|
|
+@samp{e} would be interpreted as a hexadecimal digit, the character
|
|
|
+@samp{p} or @samp{P} (for ``power'') indicates an exponent.
|
|
|
+
|
|
|
+The exponent in a hexadecimal floating constant is a possibly-signed
|
|
|
+decimal integer that specifies a power of 2 (@emph{not} 10 or 16) to
|
|
|
+multiply into the number.
|
|
|
+
|
|
|
+Here are some examples:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+0xAp2 // @r{40 in decimal}
|
|
|
+0xAp-1 // @r{5 in decimal}
|
|
|
+0x2.0Bp4 // @r{32.6875 decimal}
|
|
|
+0xE.2p3 // @r{113 decimal}
|
|
|
+0x123.ABCp0 // @r{291.6708984375 in decimal}
|
|
|
+0x123.ABCp4 // @r{4666.734375 in decimal}
|
|
|
+0x100p-8 // @r{1}
|
|
|
+0x10p-4 // @r{1}
|
|
|
+0x1p+4 // @r{16}
|
|
|
+0x1p+8 // @r{256}
|
|
|
+@end group
|
|
|
+@end example
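+
+To work out such a value by hand, convert the hexadecimal digits to
+a number, then multiply by the power of 2. The sketch below checks a
+few cases against their decimal equivalents; the function name is
+invented for illustration.
+
```c
/* Return 1 if each hexadecimal constant equals the decimal one.  */
int
hex_floats_agree (void)
{
  return (0xAp2 == 40.0        /* 10 * 2^2 */
          && 0xAp-1 == 5.0     /* 10 * 2^-1 */
          && 0x1.8p1 == 3.0    /* (1 + 8/16) * 2^1 */
          && 0x.Cp3 == 6.0);   /* (12/16) * 2^3 */
}
```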
|
|
|
+
|
|
|
+@xref{Floating-Point Data Types}.
|
|
|
+
|
|
|
+@node Imaginary Constants
|
|
|
+@section Imaginary Constants
|
|
|
+@cindex imaginary constants
|
|
|
+@cindex complex constants
|
|
|
+@cindex constants, imaginary
|
|
|
+
|
|
|
+A complex number consists of a real part plus an imaginary part.
|
|
|
+(Either or both parts may be zero.) This section explains how to
|
|
|
+write numeric constants with imaginary values. By adding these to
|
|
|
+ordinary real-valued numeric constants, we can make constants with
|
|
|
+complex values.
|
|
|
+
|
|
|
+The simple way to write an imaginary-number constant is to attach the
|
|
|
+suffix @samp{i} or @samp{I}, or @samp{j} or @samp{J}, to an integer or
|
|
|
+floating-point constant. For example, @code{2.5fi} has type
|
|
|
+@code{_Complex float} and @code{3i} has type @code{_Complex int}.
|
|
|
+The four alternative suffix letters are all equivalent.
|
|
|
+
|
|
|
+@cindex _Complex_I
|
|
|
+The other way to write an imaginary constant is to multiply a real
|
|
|
+constant by @code{_Complex_I} (a macro defined in @file{complex.h}), which represents the imaginary number
|
|
|
+i. Standard C doesn't support suffixing with @samp{i} or @samp{j}, so
|
|
|
+this clunky way is needed.
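+
+Here is a sketch of the standard C way. It includes
+@file{complex.h}, which defines @code{_Complex_I} and the macro
+@code{complex}; the function name is hypothetical.
+
```c
#include <complex.h>

/* Build the value RE + IM*i in standard C, without the GNU C
   suffix for imaginary constants.  */
double complex
make_complex (double re, double im)
{
  return re + im * _Complex_I;
}
```
+
+@noindent
+For instance, @code{make_complex (4.0, 3.0)} gives the same value as
+the GNU C expression @code{4.0 + 3.0i}.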
|
|
|
+
|
|
|
+To write a complex constant with a nonzero real part and a nonzero
|
|
|
+imaginary part, write the two separately and add them, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+4.0 + 3.0i
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+That gives the value 4 + 3i, with type @code{_Complex double}.
|
|
|
+
|
|
|
+Such a sum can include multiple real constants, or none. Likewise, it
|
|
|
+can include multiple imaginary constants, or none. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+_Complex double foo, bar, quux;
|
|
|
+
|
|
|
+foo = 2.0i + 4.0 + 3.0i; /* @r{Imaginary part is 5.0.} */
|
|
|
+bar = 4.0 + 12.0; /* @r{Imaginary part is 0.0.} */
|
|
|
+quux = 3.0i + 15.0i; /* @r{Real part is 0.0.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@xref{Complex Data Types}.
|
|
|
+
|
|
|
+@node Invalid Numbers
|
|
|
+@section Invalid Numbers
|
|
|
+
|
|
|
+Some number-like constructs which are not really valid as numeric
|
|
|
+constants are treated as numbers in preprocessing directives. If
|
|
|
+these constructs appear outside of preprocessing, they are erroneous.
|
|
|
+@xref{Preprocessing Tokens}.
|
|
|
+
|
|
|
+Sometimes we need to insert spaces to separate tokens so that they
|
|
|
+won't be combined into a single number-like construct. For example,
|
|
|
+@code{0xE+12} is a preprocessing number that is not a valid numeric
|
|
|
+constant, so it is a syntax error. If what we want is the three
|
|
|
+tokens @code{@w{0xE + 12}}, we have to use those spaces as separators.
|
|
|
+
|
|
|
+@node Character Constants
|
|
|
+@section Character Constants
|
|
|
+@cindex character constants
|
|
|
+@cindex constants, character
|
|
|
+@cindex escape sequence
|
|
|
+
|
|
|
+A @dfn{character constant} is written with single quotes, as in
|
|
|
+@code{'@var{c}'}. In the simplest case, @var{c} is a single ASCII
|
|
|
+character that the constant should represent. The constant has type
|
|
|
+@code{int}, and its value is the character code of that character.
|
|
|
+For instance, @code{'a'} represents the character code for the letter
|
|
|
+@samp{a}: 97, that is.
|
|
|
+
|
|
|
+To put the @samp{'} character (single quote) in the character
|
|
|
+constant, @dfn{quote} it with a backslash (@samp{\}). This character
|
|
|
+constant looks like @code{'\''}. This sort of sequence, starting with
|
|
|
+@samp{\}, is called an @dfn{escape sequence}---the backslash character
|
|
|
+here functions as a kind of @dfn{escape character}.
|
|
|
+
|
|
|
+To put the @samp{\} character (backslash) in the character constant,
|
|
|
+quote it likewise with @samp{\} (another backslash). This character
|
|
|
+constant looks like @code{'\\'}.
|
|
|
+
|
|
|
+@cindex bell character
|
|
|
+@cindex @samp{\a}
|
|
|
+@cindex backspace
|
|
|
+@cindex @samp{\b}
|
|
|
+@cindex tab (ASCII character)
|
|
|
+@cindex @samp{\t}
|
|
|
+@cindex vertical tab
|
|
|
+@cindex @samp{\v}
|
|
|
+@cindex formfeed
|
|
|
+@cindex @samp{\f}
|
|
|
+@cindex newline
|
|
|
+@cindex @samp{\n}
|
|
|
+@cindex return (ASCII character)
|
|
|
+@cindex @samp{\r}
|
|
|
+@cindex escape (ASCII character)
|
|
|
+@cindex @samp{\e}
|
|
|
+Here are all the escape sequences that represent specific
|
|
|
+characters in a character constant. The numeric values shown are
|
|
|
+the corresponding ASCII character codes, as decimal numbers.
|
|
|
+
|
|
|
+@example
|
|
|
+'\a' @result{} 7 /* @r{alarm, @kbd{CTRL-g}} */
|
|
|
+'\b' @result{} 8 /* @r{backspace, @key{BS}, @kbd{CTRL-h}} */
|
|
|
+'\t' @result{} 9 /* @r{tab, @key{TAB}, @kbd{CTRL-i}} */
|
|
|
+'\n' @result{} 10 /* @r{newline, @kbd{CTRL-j}} */
|
|
|
+'\v' @result{} 11 /* @r{vertical tab, @kbd{CTRL-k}} */
|
|
|
+'\f' @result{} 12 /* @r{formfeed, @kbd{CTRL-l}} */
|
|
|
+'\r' @result{} 13 /* @r{carriage return, @key{RET}, @kbd{CTRL-m}} */
|
|
|
+'\e' @result{} 27 /* @r{escape character, @key{ESC}, @kbd{CTRL-[}} */
|
|
|
+'\\' @result{} 92 /* @r{backslash character, @kbd{\}} */
|
|
|
+'\'' @result{} 39 /* @r{singlequote character, @kbd{'}} */
|
|
|
+'\"' @result{} 34 /* @r{doublequote character, @kbd{"}} */
|
|
|
+'\?' @result{} 63 /* @r{question mark, @kbd{?}} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@samp{\e} is a GNU C extension; to stick to standard C, write @samp{\33}.
|
|
|
+
|
|
|
+You can also write octal and hex character codes as
|
|
|
+@samp{\@var{octalcode}} or @samp{\x@var{hexcode}}. Decimal is not an
|
|
|
+option here, so octal codes do not need to start with @samp{0}.
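+
+The following sketch compares some numeric escape sequences with
+the characters and codes they stand for. It assumes an ASCII-based
+character set, and the function name is made up.
+
```c
/* Return 1 if the octal and hex escapes name the expected codes.  */
int
numeric_escapes_agree (void)
{
  return ('\101' == 'A'      /* octal 101 is 65 */
          && '\x41' == 'A'   /* hex 41 is 65 */
          && '\33' == 27     /* the escape character, in standard C */
          && '\x1b' == 27
          && '\0' == 0);
}
```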
|
|
|
+
|
|
|
+The character constant's value has type @code{int}. However, the
|
|
|
+character code is treated initially as a @code{char} value, which is
|
|
|
+then converted to @code{int}. If the character code is greater than
|
|
|
+127 (@code{0177} in octal), the resulting @code{int} may be negative
|
|
|
+on a platform where the type @code{char} is 8 bits long and signed.
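+
+As a sketch of this effect (with invented function names), consider
+the character code 255, written in octal:
+
```c
/* The value is -1 where char is signed and 8 bits long,
   and 255 where char is unsigned.  */
int
high_character_code (void)
{
  return '\377';
}

/* Converting to unsigned char first gives 255 in both cases.  */
int
high_character_code_unsigned (void)
{
  return (unsigned char) '\377';
}
```
+
+@noindent
+Code that compares character constants with values read as
+@code{unsigned char} should convert the constants the same way.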
|
|
|
+
|
|
|
+@node String Constants
|
|
|
+@section String Constants
|
|
|
+@cindex string constants
|
|
|
+@cindex constants, string
|
|
|
+
|
|
|
+A @dfn{string constant} represents a series of characters. It starts
|
|
|
+with @samp{"} and ends with @samp{"}; in between are the contents of
|
|
|
+the string. Quoting special characters such as @samp{"}, @samp{\} and
|
|
|
+newline in the contents works in string constants as in character
|
|
|
+constants. In a string constant, @samp{'} does not need to be quoted.
|
|
|
+
|
|
|
+A string constant defines an array of characters which contains the
|
|
|
+specified characters followed by the null character (code 0). Using
|
|
|
+the string constant is equivalent to using the name of an array with
|
|
|
+those contents. In simple cases, the length in bytes of the string
|
|
|
+constant is one greater than the number of characters written in it.
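+
+For example, in this sketch (the function names are hypothetical),
+@code{sizeof} counts the terminating null character, while the
+library function @code{strlen} does not:
+
```c
#include <string.h>

/* Bytes occupied by the constant, including the terminating null.  */
size_t
foo_size (void)
{
  return sizeof "Foo!";
}

/* Number of characters before the terminating null.  */
size_t
foo_length (void)
{
  return strlen ("Foo!");
}
```
+
+@noindent
+Here @code{foo_size} returns 5 and @code{foo_length} returns 4.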
|
|
|
+
|
|
|
+As with any array in C, using the string constant in an expression
|
|
|
+converts the array to a pointer (@pxref{Pointers}) to the array's
|
|
|
+first element (@pxref{Accessing Array Elements}). This pointer will
|
|
|
+have type @code{char *} because it points to an element of type
|
|
|
+@code{char}. @code{char *} is an example of a type designator for a
|
|
|
+pointer type (@pxref{Pointer Type Designators}). That type is used
|
|
|
+for strings generally, not just the strings expressed as constants
|
|
|
+in a program.
|
|
|
+
|
|
|
+Thus, the string constant @code{"Foo!"} is almost
|
|
|
+equivalent to declaring an array like this
|
|
|
+
|
|
|
+@example
|
|
|
+char string_array_1[] = @{'F', 'o', 'o', '!', '\0' @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and then using @code{string_array_1} in the program. There
|
|
|
+are two differences, however:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+The string constant doesn't define a name for the array.
|
|
|
+
|
|
|
+@item
|
|
|
+The string constant is probably stored in a read-only area of memory.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+Newlines are not allowed in the text of a string constant. The motive
|
|
|
+for this prohibition is to catch the error of omitting the closing
|
|
|
+@samp{"}. To put a newline in a constant string, write it as
|
|
|
+@samp{\n} in the string constant.
|
|
|
+
|
|
|
+A real null character in the source code inside a string constant
|
|
|
+causes a warning. To put a null character in the middle of a string
|
|
|
+constant, write @samp{\0} or @samp{\000}.
|
|
|
+
|
|
|
+Consecutive string constants are effectively concatenated. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+"Fo" "o!" @r{is equivalent to} "Foo!"
|
|
|
+@end example
|
|
|
+
|
|
|
+This is useful for writing a string containing multiple lines,
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+"This message is so long that it needs more than\n"
|
|
|
+"a single line of text. C does not allow a newline\n"
|
|
|
+"to represent itself in a string constant, so we have to\n"
|
|
|
+"write \\n to put it in the string. For readability of\n"
|
|
|
+"the source code, it is advisable to put line breaks in\n"
|
|
|
+"the source where they occur in the contents of the\n"
|
|
|
+"constant.\n"
|
|
|
+@end example
|
|
|
+
|
|
|
+The sequence of a backslash and a newline is ignored anywhere
|
|
|
+in a C program, and that includes inside a string constant.
|
|
|
+Thus, you can write multi-line string constants this way:
|
|
|
+
|
|
|
+@example
|
|
|
+"This is another way to put newlines in a string constant\n\
|
|
|
+and break the line after them in the source code."
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+However, concatenation is the recommended way to do this.
|
|
|
+
|
|
|
+You can also write perverse string constants like this,
|
|
|
+
|
|
|
+@example
|
|
|
+"Fo\
|
|
|
+o!"
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+but don't do that---write it like this instead:
|
|
|
+
|
|
|
+@example
|
|
|
+"Foo!"
|
|
|
+@end example
|
|
|
+
|
|
|
+Be careful to avoid passing a string constant to a function that
|
|
|
+modifies the string it receives. The memory where the string constant
|
|
|
+is stored may be read-only, which would cause a fatal @code{SIGSEGV}
|
|
|
+signal that normally terminates the program (@pxref{Signals}). Even
|
|
|
+worse, the memory may not be read-only. Then the function might
|
|
|
+modify the string constant, thus spoiling the contents of other string
|
|
|
+constants that are supposed to contain the same value and are unified
|
|
|
+by the compiler.
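+
+A safe alternative is to give the function a writable array that
+was initialized from the string constant. The sketch below shows the
+idea; @code{upcase} and @code{first_of_upcased_copy} are names
+invented for this example.
+
```c
#include <ctype.h>

/* Convert the string S to upper case, in place.  */
void
upcase (char *s)
{
  for (; *s != '\0'; s++)
    *s = (char) toupper ((unsigned char) *s);
}

/* Safe: BUFFER is an ordinary array whose initial contents are
   copied from the constant, so modifying it is allowed.  */
char
first_of_upcased_copy (void)
{
  char buffer[] = "foo!";
  upcase (buffer);
  return buffer[0];
}
```
+
+@noindent
+By contrast, calling @code{upcase ("foo!")} directly would try to
+modify the string constant itself.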
|
|
|
+
|
|
|
+@node UTF-8 String Constants
|
|
|
+@section UTF-8 String Constants
|
|
|
+@cindex UTF-8 String Constants
|
|
|
+
|
|
|
+Writing @samp{u8} immediately before a string constant, with no
|
|
|
+intervening space, means to represent that string in UTF-8 encoding as
|
|
|
+a sequence of bytes. UTF-8 represents ASCII characters with a single
|
|
|
+byte, and represents non-ASCII Unicode characters (codes 128 and up)
|
|
|
+as multibyte sequences. Here is an example of a UTF-8 constant:
|
|
|
+
|
|
|
+@example
|
|
|
+u8"A cónstàñt"
|
|
|
+@end example
|
|
|
+
|
|
|
+This constant occupies 13 bytes plus the terminating null,
|
|
|
+because each of the accented letters is a two-byte sequence.
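+
+The size can be checked with @code{sizeof}, as in this sketch. To
+avoid depending on the encoding of the source file, it writes the
+accented letters as @samp{\u} escape sequences (@pxref{Unicode
+Character Codes}); the function name is hypothetical.
+
```c
#include <stddef.h>

/* Size in bytes of the UTF-8 constant shown above, including the
   terminating null.  U+00F3, U+00E0 and U+00F1 are the three
   accented letters.  */
size_t
utf8_constant_size (void)
{
  return sizeof u8"A c\u00F3nst\u00E0\u00F1t";
}
```
+
+@noindent
+The function returns 14: seven ASCII characters at one byte each,
+three accented letters at two bytes each, and the null.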
|
|
|
+
|
|
|
+Concatenating an ordinary string with a UTF-8 string conceptually
|
|
|
+produces another UTF-8 string. However, if the ordinary string
|
|
|
+contains character codes 128 and up, the results cannot be relied on.
|
|
|
+
|
|
|
+@node Unicode Character Codes
|
|
|
+@section Unicode Character Codes
|
|
|
+@cindex Unicode character codes
|
|
|
+@cindex universal character names
|
|
|
+
|
|
|
+You can specify Unicode characters, for individual character constants
|
|
|
+or as part of string constants (@pxref{String Constants}), using
|
|
|
+escape sequences. Use the @samp{\u} escape sequence with a 16-bit
|
|
|
+hexadecimal Unicode character code, written as exactly four hex digits.
|
|
|
+If the code value is too big for 16 bits, use the @samp{\U} escape sequence
|
|
|
+with a 32-bit hexadecimal Unicode character code, written as exactly eight hex digits. (These codes are called @dfn{universal
|
|
|
+character names}.) For example,
|
|
|
+
|
|
|
+@example
|
|
|
+\u6C34 /* @r{16-bit code (UTF-16)} */
|
|
|
+\U0010ABCD /* @r{32-bit code (UTF-32)} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+One way to use these is in UTF-8 string constants (@pxref{UTF-8 String
|
|
|
+Constants}). For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+u8"fóó \u6C34 \U0010ABCD"
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also use them in wide character constants (@pxref{Wide
|
|
|
+Character Constants}), like this:
|
|
|
+
|
|
|
+@example
|
|
|
+u'\u6C34' /* @r{16-bit code} */
|
|
|
+U'\U0010ABCD' /* @r{32-bit code} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and in wide string constants (@pxref{Wide String Constants}), like
|
|
|
+this:
|
|
|
+
|
|
|
+@example
|
|
|
+u"\u6C34\u6C33" /* @r{16-bit code} */
|
|
|
+U"\U0010ABCD" /* @r{32-bit code} */
|
|
|
+@end example
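+
+In each of these constants, the value of the character is simply
+the code written in the escape sequence. The following sketch checks
+that; the function name is made up for illustration.
+
```c
/* Return 1 if each constant's value is the code written in it.  */
int
ucn_values_match (void)
{
  return (u'\u6C34' == 0x6C34
          && U'\U0010ABCD' == 0x10ABCD
          && U'\u6C34' == u'\u6C34');
}
```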
|
|
|
+
|
|
|
+Codes in the range of @code{D800} through @code{DFFF} are not valid
|
|
|
+in Unicode. Codes less than @code{00A0} are also forbidden, except for
|
|
|
+@code{0024}, @code{0040}, and @code{0060}; the forbidden codes stand
|
|
|
+for ASCII characters and control characters, which you can write
|
|
|
+directly or specify with other escape sequences (@pxref{Character Constants}).
|
|
|
+
|
|
|
+@node Wide Character Constants
|
|
|
+@section Wide Character Constants
|
|
|
+@cindex wide character constants
|
|
|
+@cindex constants, wide character
|
|
|
+
|
|
|
+A @dfn{wide character constant} represents characters with more than 8
|
|
|
+bits of character code. This is an obscure feature that we need to
|
|
|
+document but that you probably won't ever use. If you're just
|
|
|
+learning C, you may as well skip this section.
|
|
|
+
|
|
|
+The original C wide character constant looks like @samp{L} (upper
|
|
|
+case!) followed immediately by an ordinary character constant (with no
|
|
|
+intervening space). Its data type is @code{wchar_t}, which is an
|
|
|
+alias defined in @file{stddef.h} for one of the standard integer
|
|
|
+types. Depending on the platform, it could be 16 bits or 32 bits. If
|
|
|
+it is 16 bits, these character constants use the UTF-16 form of
|
|
|
+Unicode; if 32 bits, UTF-32.
|
|
|
+
|
|
|
+There are also Unicode wide character constants which explicitly
|
|
|
+specify the width. These constants start with @samp{u} or @samp{U}
|
|
|
+instead of @samp{L}. @samp{u} specifies a 16-bit Unicode wide
|
|
|
+character constant, and @samp{U} a 32-bit Unicode wide character
|
|
|
+constant. Their types are, respectively, @code{char16_t} and
|
|
|
+@w{@code{char32_t}}; they are declared in the header file
|
|
|
+@file{uchar.h}. These character constants are valid even if
|
|
|
+@file{uchar.h} is not included, but some uses of them may be
|
|
|
+inconvenient without including it to declare those type names.
|
|
|
+
|
|
|
+The character represented in a wide character constant can be an
|
|
|
+ordinary ASCII character. @code{L'a'}, @code{u'a'} and @code{U'a'}
|
|
|
+are all valid, and they are all equal to @code{'a'}.
|
|
|
+
|
|
|
+In all three kinds of wide character constants, you can write a
|
|
|
+non-ASCII Unicode character in the constant itself; the constant's
|
|
|
+value is the character's Unicode character code. Or you can specify
|
|
|
+the Unicode character with an escape sequence (@pxref{Unicode
|
|
|
+Character Codes}).
|
|
|
+
|
|
|
+@node Wide String Constants
|
|
|
+@section Wide String Constants
|
|
|
+@cindex wide string constants
|
|
|
+@cindex constants, wide string
|
|
|
+
|
|
|
+A @dfn{wide string constant} stands for an array of 16-bit or 32-bit
|
|
|
+characters. They are rarely used; if you're just
|
|
|
+learning C, you may as well skip this section.
|
|
|
+
|
|
|
+There are three kinds of wide string constants, which differ in the
|
|
|
+data type used for each character in the string. Each wide string
|
|
|
+constant is equivalent to an array of integers, but the data type of
|
|
|
+those integers depends on the kind of wide string. Using the constant
|
|
|
+in an expression will convert the array to a pointer to its first
|
|
|
+element, as usual for arrays in C (@pxref{Accessing Array Elements}).
|
|
|
+For each kind of wide string constant, we state here what type that
|
|
|
+pointer will be.
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item char16_t
|
|
|
+This is a 16-bit Unicode wide string constant: each element is a
|
|
|
+16-bit Unicode character code with type @code{char16_t}, so the string
|
|
|
+has the pointer type @code{char16_t@ *}. (That is a type designator;
|
|
|
+@pxref{Pointer Type Designators}.) The constant is written as
|
|
|
+@samp{u} (which must be lower case) followed (with no intervening
|
|
|
+space) by a string constant with the usual syntax.
|
|
|
+
|
|
|
+@item char32_t
|
|
|
+This is a 32-bit Unicode wide string constant: each element is a
|
|
|
+32-bit Unicode character code, and the string has type @code{char32_t@ *}.
|
|
|
+It's written as @samp{U} (which must be upper case) followed (with no
|
|
|
+intervening space) by a string constant with the usual syntax.
|
|
|
+
|
|
|
+@item wchar_t
|
|
|
+This is the original kind of wide string constant. It's written as
|
|
|
+@samp{L} (which must be upper case) followed (with no intervening
|
|
|
+space) by a string constant with the usual syntax, and the string has
|
|
|
+type @code{wchar_t@ *}.
|
|
|
+
|
|
|
+The width of the data type @code{wchar_t} depends on the target
|
|
|
+platform, which makes this kind of wide string somewhat less useful
|
|
|
+than the newer kinds.
|
|
|
+@end table
|
|
|
+
|
|
|
+@code{char16_t} and @code{char32_t} are declared in the header file
|
|
|
+@file{uchar.h}. @code{wchar_t} is declared in @file{stddef.h}.
|
|
|
+
|
|
|
+Consecutive wide string constants of the same kind concatenate, just
|
|
|
+like ordinary string constants. A wide string constant concatenated
|
|
|
+with an ordinary string constant results in a wide string constant.
|
|
|
+You can't concatenate two wide string constants of different kinds.
|
|
|
+You also can't concatenate a wide string constant (of any kind) with a
|
|
|
+UTF-8 string constant.
|
|
|
+
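As an illustration of the first two rules, this sketch (the function
name is hypothetical) concatenates a wide string constant with an
ordinary one and measures the result with @code{wcslen}, which is
declared in @file{wchar.h}.

```c
#include <stddef.h>
#include <wchar.h>

size_t
concatenated_length (void)
{
  /* A wide constant followed by an ordinary constant forms
     one wchar_t string.  */
  static const wchar_t joined[] = L"abc" "def";
  return wcslen (joined);
}
```

The result is 6, the number of characters before the terminating null.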
|
|
|
+@node Type Size
|
|
|
+@chapter Type Size
|
|
|
+@cindex type size
|
|
|
+@cindex size of type
|
|
|
+@findex sizeof
|
|
|
+
|
|
|
+Each data type has a @dfn{size}, which is the number of bytes
|
|
|
+(@pxref{Storage}) that it occupies in memory. To refer to the size in
|
|
|
+a C program, use @code{sizeof}. There are two ways to use it:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item sizeof @var{expression}
|
|
|
+This gives the size of @var{expression}, based on its data type. It
|
|
|
+does not calculate the value of @var{expression}, only its size, so if
|
|
|
+@var{expression} includes side effects or function calls, they do not
|
|
|
+happen. Therefore, @code{sizeof} is a compile-time operation
|
|
|
+that has zero run-time cost (except for a variable-length array, whose size is computed at run time).
|
|
|
+
|
|
|
+A value that is a bit field (@pxref{Bit Fields}) is not allowed as an
|
|
|
+operand of @code{sizeof}.
|
|
|
+
|
|
|
+For example,
|
|
|
+
|
|
|
+@example
|
|
|
+double a;
|
|
|
+
|
|
|
+i = sizeof a + 10;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+sets @code{i} to 18 on most computers because @code{a} occupies 8 bytes.
|
|
|
+
|
|
|
+Here's how to determine the number of elements in an array
|
|
|
+@code{array}:
|
|
|
+
|
|
|
+@example
|
|
|
+(sizeof array / sizeof array[0])
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The expression @code{sizeof array} gives the size of the array, not
|
|
|
+the size of a pointer to an element. However, if @var{expression} is
|
|
|
+a function parameter that was declared as an array, that
|
|
|
+variable really has a pointer type (@pxref{Array Parm Pointer}), so
|
|
|
+the result is the size of that pointer.
|
|
|
+
|
|
|
+@item sizeof (@var{type})
|
|
|
+This gives the size of @var{type}.
|
|
|
+For example,
|
|
|
+
|
|
|
+@example
|
|
|
+i = sizeof (double) + 10;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is equivalent to the previous example.
|
|
|
+
|
|
|
+You can't apply @code{sizeof} to an incomplete type (@pxref{Incomplete
|
|
|
+Types}), nor @code{void}. Using it on a function type gives 1 in GNU
|
|
|
+C, which makes adding an integer to a function pointer work as desired
|
|
|
+(@pxref{Pointer Arithmetic}).
|
|
|
+@end table
|
|
|
+
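Here is a small sketch of two points made in the table: the
element-count idiom, and the fact that the operand of @code{sizeof} is
not evaluated. The function names are chosen only for this example.

```c
#include <stddef.h>

/* Element count of a local array, using the idiom shown above.  */
size_t
count_elements (void)
{
  double array[5];
  return sizeof array / sizeof array[0];
}

/* The increment inside sizeof never happens, so this returns 0.  */
int
operand_not_evaluated (void)
{
  int n = 0;
  size_t size = sizeof (n++);
  (void) size;                  /* Only N matters here.  */
  return n;
}
```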
|
|
|
+@strong{Warning}: When you use @code{sizeof} with a type
|
|
|
+instead of an expression, you must write parentheses around the type.
|
|
|
+
|
|
|
+@strong{Warning}: When applying @code{sizeof} to the result of a cast
|
|
|
+(@pxref{Explicit Type Conversion}), you must write parentheses around
|
|
|
+the cast expression to avoid an ambiguity in the grammar of C@.
|
|
|
+Specifically,
|
|
|
+
|
|
|
+@example
|
|
|
+sizeof (int) -x
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+parses as
|
|
|
+
|
|
|
+@example
|
|
|
+(sizeof (int)) - x
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+If what you want is
|
|
|
+
|
|
|
+@example
|
|
|
+sizeof ((int) -x)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+you must write it that way, with parentheses.
|
|
|
+
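To see the difference in practice, consider this sketch. The function
names are invented, and @code{char} is used in place of @code{int} so
that the two results are easy to tell apart on any platform.

```c
#include <stddef.h>

/* Parses as (sizeof (char)) - x, which is a subtraction.  */
size_t
size_minus_x (int x)
{
  return sizeof (char) - x;
}

/* The size of the cast expression's type; for char that is 1,
   whatever the value of X.  */
size_t
size_of_cast (int x)
{
  return sizeof ((char) -x);
}
```

With an argument of 1, the first function returns 0 and the second
returns 1.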
|
|
|
+The data type of the value of the @code{sizeof} operator is always one
|
|
|
+of the unsigned integer types; which one of those types depends on the
|
|
|
+machine. The header file @file{stddef.h} defines the typedef name
|
|
|
+@code{size_t} as an alias for this type. @xref{Defining Typedef
|
|
|
+Names}.
|
|
|
+
|
|
|
+@node Pointers
|
|
|
+@chapter Pointers
|
|
|
+@cindex pointers
|
|
|
+
|
|
|
+Among high-level languages, C is rather low level, close to the
|
|
|
+machine. This is mainly because it has explicit @dfn{pointers}. A
|
|
|
+pointer value is the numeric address of data in memory. The type of
|
|
|
+data to be found at that address is specified by the data type of the
|
|
|
+pointer itself. The unary operator @samp{*} gets the data that a
|
|
|
+pointer points to---this is called @dfn{dereferencing the pointer}.
|
|
|
+
|
|
|
+C also allows pointers to functions, but since there are some
|
|
|
+differences in how they work, we treat them later. @xref{Function
|
|
|
+Pointers}.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Address of Data:: Using the ``address-of'' operator.
|
|
|
+* Pointer Types:: For each type, there is a pointer type.
|
|
|
+* Pointer Declarations:: Declaring variables with pointer types.
|
|
|
+* Pointer Type Designators:: Designators for pointer types.
|
|
|
+* Pointer Dereference:: Accessing what a pointer points at.
|
|
|
+* Null Pointers:: Pointers which do not point to any object.
|
|
|
+* Invalid Dereference:: Dereferencing null or invalid pointers.
|
|
|
+* Void Pointers:: Totally generic pointers, can cast to any.
|
|
|
+* Pointer Comparison:: Comparing memory address values.
|
|
|
+* Pointer Arithmetic:: Computing memory address values.
|
|
|
+* Pointers and Arrays:: Using pointer syntax instead of array syntax.
|
|
|
+* Pointer Arithmetic Low Level:: More about computing memory address values.
|
|
|
+* Pointer Increment/Decrement:: Incrementing and decrementing pointers.
|
|
|
+* Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
|
|
|
+* Pointer-Integer Conversion:: Converting pointer types to integer types.
|
|
|
+* Printing Pointers:: Using @code{printf} for a pointer's value.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Address of Data
|
|
|
+@section Address of Data
|
|
|
+
|
|
|
+@cindex address-of operator
|
|
|
+The most basic way to make a pointer is with the ``address-of''
|
|
|
+operator, @samp{&}. Let's suppose we have these variables available:
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+double a[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+Now, @code{&i} gives the address of the variable @code{i}---a pointer
|
|
|
+value that points to @code{i}'s location---and @code{&a[3]} gives the
|
|
|
+address of the element 3 of @code{a}. (It is actually the fourth
|
|
|
+element in the array, since the first element has index 0.)
|
|
|
+
|
|
|
+The address-of operator is unusual because it operates on a place to
|
|
|
+store a value (an lvalue, @pxref{Lvalues}), not on the value currently
|
|
|
+stored there. (The left argument of a simple assignment is unusual in
|
|
|
+the same way.) You can use it on any lvalue except a bit field
|
|
|
+(@pxref{Bit Fields}) or a constructor (@pxref{Structure
|
|
|
+Constructors}).
|
|
|
+
|
|
|
+
|
|
|
+@node Pointer Types
|
|
|
+@section Pointer Types
|
|
|
+
|
|
|
+For each data type @var{t}, there is a type for pointers to type
|
|
|
+@var{t}. For these variables,
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+double a[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+@code{i} has type @code{int}; we say
|
|
|
+@code{&i} is a ``pointer to @code{int}.''
|
|
|
+
|
|
|
+@item
|
|
|
+@code{a} has type @code{double[5]}; we say @code{&a} is a ``pointer to
|
|
|
+arrays of five @code{double}s.''
|
|
|
+
|
|
|
+@item
|
|
|
+@code{a[3]} has type @code{double}; we say @code{&a[3]} is a ``pointer
|
|
|
+to @code{double}.''
|
|
|
+@end itemize
|
|
|
+
|
|
|
+@node Pointer Declarations
|
|
|
+@section Pointer-Variable Declarations
|
|
|
+
|
|
|
+The way to declare that a variable @code{foo} points to type @var{t} is
|
|
|
+
|
|
|
+@example
|
|
|
+@var{t} *foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+To remember this syntax, think ``if you dereference @code{foo}, using
|
|
|
+the @samp{*} operator, what you get is type @var{t}. Thus, @code{foo}
|
|
|
+points to type @var{t}.''
|
|
|
+
|
|
|
+Thus, we can declare variables that hold pointers to these three
|
|
|
+types, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+int *ptri; /* @r{Pointer to @code{int}.} */
|
|
|
+double *ptrd; /* @r{Pointer to @code{double}.} */
|
|
|
+double (*ptrda)[5]; /* @r{Pointer to @code{double[5]}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@samp{int *ptri;} means, ``if you dereference @code{ptri}, you get an
|
|
|
+@code{int}.'' @samp{double (*ptrda)[5];} means, ``if you dereference
|
|
|
+@code{ptrda}, then subscript it by an integer less than 5, you get a
|
|
|
+@code{double}.'' The parentheses express the point that you would
|
|
|
+dereference it first, then subscript it.
|
|
|
+
|
|
|
+Contrast the last one with this:
|
|
|
+
|
|
|
+@example
|
|
|
+double *aptrd[5]; /* @r{Array of five pointers to @code{double}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Because @samp{*} has lower syntactic precedence than subscripting,
|
|
|
+you would subscript @code{aptrd} then dereference it. Therefore, it
|
|
|
+declares an array of pointers, not a pointer.
|
|
|
+
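The contrast between the two declarations can be sketched in running
code. The function name and the array contents below are assumptions
made for the illustration; the variable names follow the declarations
above.

```c
/* Fetch element 3 of the same array through both kinds of
   variable and add the two results.  */
double
element_three_twice (void)
{
  double a[5] = { 1.0, 2.0, 3.0, 4.0, 5.0 };
  double (*ptrda)[5] = &a;      /* Pointer to the whole array.  */
  double *aptrd[5];             /* Array of five pointers.  */

  for (int k = 0; k < 5; k++)
    aptrd[k] = &a[k];

  /* Dereference then subscript, versus subscript then dereference.  */
  return (*ptrda)[3] + *aptrd[3];
}
```

Both operands are @code{a[3]}, so the function returns 8.0.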
|
|
|
+@node Pointer Type Designators
|
|
|
+@section Pointer-Type Designators
|
|
|
+
|
|
|
+Every type in C has a designator; you make it by deleting the variable
|
|
|
+name and the semicolon from a declaration (@pxref{Type
|
|
|
+Designators}). Here are the designators for the pointer
|
|
|
+types of the example declarations in the previous section:
|
|
|
+
|
|
|
+@example
|
|
|
+int * /* @r{Pointer to @code{int}.} */
|
|
|
+double * /* @r{Pointer to @code{double}.} */
|
|
|
+double (*)[5] /* @r{Pointer to @code{double[5]}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+Remember, to understand what type a designator stands for, imagine the
|
|
|
+variable name that would be in the declaration, and figure out what
|
|
|
+type it would declare that variable with. @code{double (*)[5]} can
|
|
|
+only come from @code{double (*@var{variable})[5]}, so it's a pointer
|
|
|
+which, when dereferenced, gives an array of 5 @code{double}s.
|
|
|
+
|
|
|
+@node Pointer Dereference
|
|
|
+@section Dereferencing Pointers
|
|
|
+@cindex dereferencing pointers
|
|
|
+@cindex pointer dereferencing
|
|
|
+
|
|
|
+The main use of a pointer value is to @dfn{dereference it} (access the
|
|
|
+data it points at) with the unary @samp{*} operator. For instance,
|
|
|
+@code{*&i} is the value at @code{i}'s address---which is just
|
|
|
+@code{i}. The two expressions are equivalent, provided @code{&i} is
|
|
|
+valid.
|
|
|
+
|
|
|
+A pointer-dereference expression whose type is data (not a function)
|
|
|
+is an lvalue.
|
|
|
+
|
|
|
+Pointers become really useful when we store them somewhere and use
|
|
|
+them later. Here's a simple example to illustrate the practice:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+ int *ptr;
|
|
|
+
|
|
|
+ ptr = &i;
|
|
|
+
|
|
|
+ i = 5;
|
|
|
+
|
|
|
+ @r{@dots{}}
|
|
|
+
|
|
|
+ return *ptr; /* @r{Returns 5, fetched from @code{i}.} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This shows how to declare the variable @code{ptr} as type
|
|
|
+@code{int *} (pointer to @code{int}), store a pointer value into it
|
|
|
+(pointing at @code{i}), and use it later to get the value of the
|
|
|
+object it points at (the value in @code{i}).
|
|
|
+
|
|
|
+If anyone can provide a useful example which is this basic,
|
|
|
+I would be grateful.
|
|
|
+
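One small but practical use of stored pointers, offered as a sketch
rather than as the canonical example, is a function that modifies its
caller's variables. The name @code{swap_ints} is hypothetical.

```c
/* Exchange the values stored at the two addresses.  */
void
swap_ints (int *a, int *b)
{
  int tmp = *a;
  *a = *b;
  *b = tmp;
}
```

A caller would write @code{swap_ints (&x, &y);} to exchange two
@code{int} variables @code{x} and @code{y}.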
|
|
|
+@node Null Pointers
|
|
|
+@section Null Pointers
|
|
|
+@cindex null pointers
|
|
|
+@cindex pointers, null
|
|
|
+
|
|
|
+@c ???stdio loads sttddef
|
|
|
+
|
|
|
+A pointer value can be @dfn{null}, which means it does not point to
|
|
|
+any object. The cleanest way to get a null pointer is by writing
|
|
|
+@code{NULL}, a standard macro defined in @file{stddef.h}. You can
|
|
|
+also do it by casting 0 to the desired pointer type, as in
|
|
|
+@code{(char *) 0}. (The cast operator performs explicit type conversion;
|
|
|
+@pxref{Explicit Type Conversion}.)
|
|
|
+
|
|
|
+You can store a null pointer in any lvalue whose data type
|
|
|
+is a pointer type:
|
|
|
+
|
|
|
+@example
|
|
|
+char *foo;
|
|
|
+foo = NULL;
|
|
|
+@end example
|
|
|
+
|
|
|
+These two, if consecutive, can be combined into a declaration with
|
|
|
+initializer,
|
|
|
+
|
|
|
+@example
|
|
|
+char *foo = NULL;
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also explicitly cast @code{NULL} to the specific pointer type
|
|
|
+you want---it makes no difference.
|
|
|
+
|
|
|
+@example
|
|
|
+char *foo;
|
|
|
+foo = (char *) NULL;
|
|
|
+@end example
|
|
|
+
|
|
|
+To test whether a pointer is null, compare it with zero or
|
|
|
+@code{NULL}, as shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+if (p != NULL)
|
|
|
+ /* @r{@code{p} is not null.} */
|
|
|
+ operate (p);
|
|
|
+@end example
|
|
|
+
|
|
|
+Since testing a pointer for not being null is basic and frequent, all
|
|
|
+but beginners in C will understand the conditional without need for
|
|
|
+@code{!= NULL}:
|
|
|
+
|
|
|
+@example
|
|
|
+if (p)
|
|
|
+ /* @r{@code{p} is not null.} */
|
|
|
+ operate (p);
|
|
|
+@end example
|
|
|
+
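A typical use of the null test is to guard a library call that must
not receive a null pointer. This is a minimal sketch; the function
name @code{safe_length} is invented, while @code{strlen} is the
standard function declared in @file{string.h}.

```c
#include <stddef.h>
#include <string.h>

/* Length of the string S, or 0 if S is a null pointer.  */
size_t
safe_length (const char *s)
{
  if (s)
    /* S is not null, so it is safe to pass it to strlen.  */
    return strlen (s);
  return 0;
}
```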
|
|
|
+@node Invalid Dereference
|
|
|
+@section Dereferencing Null or Invalid Pointers
|
|
|
+
|
|
|
+Trying to dereference a null pointer is an error. On most platforms,
|
|
|
+it generally causes a signal, usually @code{SIGSEGV}
|
|
|
+(@pxref{Signals}).
|
|
|
+
|
|
|
+@example
|
|
|
+char *foo = NULL;
|
|
|
+c = *foo; /* @r{This causes a signal and terminates.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Dereferencing a pointer that has the wrong alignment for the target
|
|
|
+data type is likewise an error (on most types of computer), as is
|
|
|
+dereferencing one that points to a part of memory that has not been allocated in the process's address space.
|
|
|
+
|
|
|
+The signal terminates the program, unless the program has arranged to
|
|
|
+handle the signal (@pxref{Signal Handling, The GNU C Library, , libc,
|
|
|
+The GNU C Library Reference Manual}).
|
|
|
+
|
|
|
+However, the signal might not happen if the dereference is optimized
|
|
|
+away. In the example above, if you don't subsequently use the value
|
|
|
+of @code{c}, GCC might optimize away the code for @code{*foo}. You
|
|
|
+can prevent such optimization using the @code{volatile} qualifier, as
|
|
|
+shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+volatile char *p;
|
|
|
+volatile char c;
|
|
|
+c = *p;
|
|
|
+@end example
|
|
|
+
|
|
|
+You can use this to test whether @code{p} points to unallocated
|
|
|
+memory. Set up a signal handler first, so the signal won't terminate
|
|
|
+the program.
|
|
|
+
|
|
|
+@node Void Pointers
|
|
|
+@section Void Pointers
|
|
|
+@cindex void pointers
|
|
|
+@cindex pointers, void
|
|
|
+
|
|
|
+The peculiar type @code{void *}, a pointer whose target type is
|
|
|
+@code{void}, is used often in C@. It represents a pointer to
|
|
|
+we-don't-say-what. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+void *numbered_slot_pointer (int);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares a function @code{numbered_slot_pointer} that takes an
|
|
|
+integer parameter and returns a pointer, but we don't say what type of
|
|
|
+data it points to.
|
|
|
+
|
|
|
+With type @code{void *}, you can pass the pointer around and test
|
|
|
+whether it is null. However, dereferencing it gives a @code{void}
|
|
|
+value that can't be used (@pxref{The Void Type}). To dereference the
|
|
|
+pointer, first convert it to some other pointer type.
|
|
|
+
|
|
|
+Assignments convert @code{void *} automatically to any other pointer
|
|
|
+type, if the left operand has a pointer type; for instance,
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ int *p;
|
|
|
+ /* @r{Converts return value to @code{int *}.} */
|
|
|
+ p = numbered_slot_pointer (5);
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Passing an argument of type @code{void *} for a parameter that has a
|
|
|
+pointer type also converts. For example, supposing the function
|
|
|
+@code{hack} is declared to require type @code{float *} for its
|
|
|
+argument, this will convert the null pointer to that type.
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{Declare @code{hack} that way.}
|
|
|
+ @r{We assume it is defined somewhere else.} */
|
|
|
+void hack (float *);
|
|
|
+@dots{}
|
|
|
+/* @r{Now call @code{hack}.} */
|
|
|
+@{
|
|
|
+ /* @r{Converts return value of @code{numbered_slot_pointer}}
|
|
|
+ @r{to @code{float *} to pass it to @code{hack}.} */
|
|
|
+ hack (numbered_slot_pointer (5));
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also convert to another pointer type with an explicit cast
|
|
|
+(@pxref{Explicit Type Conversion}), like this:
|
|
|
+
|
|
|
+@example
|
|
|
+(int *) numbered_slot_pointer (5)
|
|
|
+@end example
|
|
|
+
|
|
|
+Here is an example which decides at run time which pointer
|
|
|
+type to convert to:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+extract_int_or_double (void *ptr, bool its_an_int)
|
|
|
+@{
|
|
|
+ if (its_an_int)
|
|
|
+ handle_an_int (*(int *)ptr);
|
|
|
+ else
|
|
|
+ handle_a_double (*(double *)ptr);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The expression @code{*(int *)ptr} means to convert @code{ptr}
|
|
|
+to type @code{int *}, then dereference it.
|
|
|
+
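A common place where @code{void *} appears is the standard library
function @code{qsort}, which passes pointers to the elements as
@code{const void *} because it cannot know their type. In this sketch
the names @code{compare_ints} and @code{sort_ints} are invented for
illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison function in the form that qsort requires.  */
static int
compare_ints (const void *a, const void *b)
{
  /* Convert to the real pointer type, then dereference.  */
  int x = *(const int *) a;
  int y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Sort the N ints starting at V into increasing order.  */
void
sort_ints (int *v, size_t n)
{
  qsort (v, n, sizeof v[0], compare_ints);
}
```

Writing the comparison as @code{(x > y) - (x < y)} rather than
@code{x - y} avoids integer overflow when the values are far apart.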
|
|
|
+@node Pointer Comparison
|
|
|
+@section Pointer Comparison
|
|
|
+@cindex pointer comparison
|
|
|
+@cindex comparison, pointer
|
|
|
+
|
|
|
+Two pointer values are equal if they point to the same location, or if
|
|
|
+they are both null. You can test for this with @code{==} and
|
|
|
+@code{!=}. Here's a trivial example:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+ int *p, *q;
|
|
|
+
|
|
|
+ p = &i;
|
|
|
+ q = &i;
|
|
|
+ if (p == q)
|
|
|
+ printf ("This will be printed.\n");
|
|
|
+ if (p != q)
|
|
|
+ printf ("This won't be printed.\n");
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Ordering comparisons such as @code{>} and @code{>=} operate on
|
|
|
+pointers by converting them to unsigned integers. The C standard says
|
|
|
+the two pointers must point within the same object in memory, but on
|
|
|
+GNU/Linux systems these operations simply compare the numeric values
|
|
|
+of the pointers.
|
|
|
+
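The most common use of an ordering comparison is a loop bounded by a
pointer just past the end of an array, which stays within what the C
standard permits. Here is a sketch, with a hypothetical function name:

```c
/* Add up the ints from BEGIN up to, but not including, END.
   Both pointers are assumed to point into the same array
   (END may point just past its last element).  */
int
sum_range (const int *begin, const int *end)
{
  int sum = 0;
  for (const int *p = begin; p < end; p++)
    sum += *p;
  return sum;
}
```

For an array @code{a} of four elements, the call would be
@code{sum_range (a, a + 4)}.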
|
|
|
+The pointer values to be compared should in principle have the same type, but
|
|
|
+they are allowed to differ in limited cases. First of all, if the two
|
|
|
+pointers' target types are nearly compatible (@pxref{Compatible
|
|
|
+Types}), the comparison is allowed.
|
|
|
+
|
|
|
+If one of the operands is @code{void *} (@pxref{Void Pointers}) and
|
|
|
+the other is another pointer type, the comparison operator converts
|
|
|
+the @code{void *} pointer to the other type so as to compare them.
|
|
|
+(In standard C, this is not allowed if the other type is a function
|
|
|
+pointer type, but that works in GNU C@.)
|
|
|
+
|
|
|
+Comparison operators also allow comparing the integer 0 with a pointer
|
|
|
+value. This works by converting 0 to a null pointer of the same type
|
|
|
+as the other operand.
|
|
|
+
|
|
|
+@node Pointer Arithmetic
|
|
|
+@section Pointer Arithmetic
|
|
|
+@cindex pointer arithmetic
|
|
|
+@cindex arithmetic, pointer
|
|
|
+
|
|
|
+Adding an integer (positive or negative) to a pointer is valid in C@.
|
|
|
+It assumes that the pointer points to an element in an array, and
|
|
|
+advances or retracts the pointer across as many array elements as the
|
|
|
+integer specifies. Here is an example, in which adding a positive
|
|
|
+integer advances the pointer to a later element in the same array.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+incrementing_pointers ()
|
|
|
+@{
|
|
|
+ int array[5] = @{ 45, 29, 104, -3, 123456 @};
|
|
|
+ int elt0, elt1, elt4;
|
|
|
+
|
|
|
+ int *p = &array[0];
|
|
|
+ /* @r{Now @code{p} points at element 0. Fetch it.} */
|
|
|
+ elt0 = *p;
|
|
|
+
|
|
|
+ ++p;
|
|
|
+ /* @r{Now @code{p} points at element 1. Fetch it.} */
|
|
|
+ elt1 = *p;
|
|
|
+
|
|
|
+ p += 3;
|
|
|
+ /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
|
|
|
+ elt4 = *p;
|
|
|
+
|
|
|
+ printf ("elt0 %d elt1 %d elt4 %d.\n",
|
|
|
+ elt0, elt1, elt4);
|
|
|
+ /* @r{Prints elt0 45 elt1 29 elt4 123456.} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Here's an example where adding a negative integer retracts the pointer
|
|
|
+to an earlier element in the same array.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+decrementing_pointers ()
|
|
|
+@{
|
|
|
+ int array[5] = @{ 45, 29, 104, -3, 123456 @};
|
|
|
+ int elt0, elt3, elt4;
|
|
|
+
|
|
|
+ int *p = &array[4];
|
|
|
+ /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
|
|
|
+ elt4 = *p;
|
|
|
+
|
|
|
+ --p;
|
|
|
+ /* @r{Now @code{p} points at element 3. Fetch it.} */
|
|
|
+ elt3 = *p;
|
|
|
+
|
|
|
+ p -= 3;
|
|
|
+ /* @r{Now @code{p} points at element 0. Fetch it.} */
|
|
|
+ elt0 = *p;
|
|
|
+
|
|
|
+ printf ("elt0 %d elt3 %d elt4 %d.\n",
|
|
|
+ elt0, elt3, elt4);
|
|
|
+ /* @r{Prints elt0 45 elt3 -3 elt4 123456.} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+If one pointer value was made by adding an integer to another
|
|
|
+pointer value, it should be possible to subtract the pointer values
|
|
|
+and recover that integer. That works too in C@.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+subtract_pointers ()
|
|
|
+@{
|
|
|
+ int array[5] = @{ 45, 29, 104, -3, 123456 @};
|
|
|
+ int *p0, *p3, *p4;
|
|
|
+
|
|
|
+ int *p = &array[4];
|
|
|
+ /* @r{Now @code{p} points at element 4 (the last). Save the value.} */
|
|
|
+ p4 = p;
|
|
|
+
|
|
|
+ --p;
|
|
|
+ /* @r{Now @code{p} points at element 3. Save the value.} */
|
|
|
+ p3 = p;
|
|
|
+
|
|
|
+ p -= 3;
|
|
|
+ /* @r{Now @code{p} points at element 0. Save the value.} */
|
|
|
+ p0 = p;
|
|
|
+
|
|
|
+ printf ("%td, %td, %td, %td\n",
|
|
|
+ p4 - p0, p0 - p0, p3 - p0, p0 - p3);
|
|
|
+ /* @r{Prints 4, 0, 3, -3.} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The addition operation does not know where arrays are. All it does is
|
|
|
+add the integer (multiplied by object size) to the value of the
|
|
|
+pointer. When the initial pointer and the result point into a single
|
|
|
+array, the result is well-defined.
|
|
|
+
|
|
|
+@strong{Warning:} Only experts should do pointer arithmetic involving pointers
|
|
|
+into different memory objects.
|
|
|
+
|
|
|
+The difference between two pointers has type @code{int}, or
|
|
|
+@code{long} if necessary (@pxref{Integer Types}). The clean way to
|
|
|
+declare it is to use the typedef name @code{ptrdiff_t} defined in the
|
|
|
+file @file{stddef.h}.
|
|
|
+
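As a sketch of @code{ptrdiff_t} in use, the following function turns
a pointer into an index by subtracting the array's starting address.
The name @code{index_of} and its parameters are assumptions made for
this example.

```c
#include <stddef.h>

/* Return the index of the first element equal to VALUE among
   the N elements starting at ARRAY, or -1 if there is none.  */
ptrdiff_t
index_of (const int *array, size_t n, int value)
{
  for (const int *p = array; p < array + n; p++)
    if (*p == value)
      return p - array;         /* Pointer difference, in elements.  */
  return -1;
}
```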
|
|
|
+This definition of pointer subtraction is consistent with
|
|
|
+pointer-integer addition, in that @code{(p3 - p1) + p1} equals
|
|
|
+@code{p3}, as in ordinary algebra.
|
|
|
+
|
|
|
+In standard C, addition and subtraction are not allowed on @code{void
|
|
|
+*}, since the target type's size is not defined in that case.
|
|
|
+Likewise, they are not allowed on pointers to function types.
|
|
|
+However, these operations work in GNU C, and the ``size of the target
|
|
|
+type'' is taken as 1.
|
|
|
+
|
|
|
+@node Pointers and Arrays
|
|
|
+@section Pointers and Arrays
|
|
|
+@cindex pointers and arrays
|
|
|
+@cindex arrays and pointers
|
|
|
+
|
|
|
+The clean way to refer to an array element is
|
|
|
+@code{@var{array}[@var{index}]}. Another, complicated way to do the
|
|
|
+same job is to get the address of that element as a pointer, then
|
|
|
+dereference it: @code{* (&@var{array}[0] + @var{index})} (or
|
|
|
+equivalently @code{* (@var{array} + @var{index})}). This first gets a
|
|
|
+pointer to element zero, then increments it with @code{+} to point to
|
|
|
+the desired element, then gets the value from there.
|
|
|
+
|
|
|
+That pointer-arithmetic construct is the @emph{definition} of square
|
|
|
+brackets in C@. @code{@var{a}[@var{b}]} means, by definition,
|
|
|
+@code{*(@var{a} + @var{b})}. This definition uses @var{a} and @var{b}
|
|
|
+symmetrically, so one must be a pointer and the other an integer; it
|
|
|
+does not matter which comes first.
|
|
|
+
|
|
|
+Since indexing with square brackets is defined in terms of addition
|
|
|
+and dereference, that too is symmetrical. Thus, you can write
|
|
|
+@code{3[array]} and it is equivalent to @code{array[3]}. However, it
|
|
|
+would be foolish to write @code{3[array]}, since it has no advantage
|
|
|
+and could confuse people who read the code.
|
|
|
+
|
|
|
+It may seem like a discrepancy that the definition @code{*(@var{a} +
|
|
|
+@var{b})} requires a pointer, but @code{array[3]} uses an array value
|
|
|
+instead. Why is this valid? The name of the array, when used by
|
|
|
+itself as an expression (other than in @code{sizeof}), stands for a
|
|
|
+pointer to the array's zeroth element. Thus, @code{array + 3}
|
|
|
+converts @code{array} implicitly to @code{&array[0]}, and the result
|
|
|
+is a pointer to element 3, equivalent to @code{&array[3]}.
|
|
|
+
|
|
|
+Since square brackets are defined in terms of such addition,
|
|
|
+@code{array[3]} first converts @code{array} to a pointer. That's why
|
|
|
+it works to use an array directly in that construct.
|
|
|
+
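The equivalences described in this section can be checked directly.
In this sketch (the function name is invented), the parameter is
already a pointer, which is what an array converts to when you pass
it as an argument.

```c
#include <stdbool.h>

/* True if all the equivalent spellings agree for element I.  */
bool
index_forms_agree (const int *array, int i)
{
  return (array[i] == *(array + i)
          && array[i] == i[array]
          && &array[i] == array + i);
}
```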
|
|
|
+@node Pointer Arithmetic Low Level
|
|
|
+@section Pointer Arithmetic at Low Level
|
|
|
+@cindex pointer arithmetic, low level
|
|
|
+@cindex low level pointer arithmetic
|
|
|
+
|
|
|
+The behavior of pointer arithmetic is theoretically defined only when
|
|
|
+the pointer values all point within one object allocated in memory.
|
|
|
+But the addition and subtraction operators can't tell whether the
|
|
|
+pointer values are all within one object. They don't know where
|
|
|
+objects start and end. So what do they really do?
|
|
|
+
|
|
|
+Adding pointer @var{p} to integer @var{i} treats @var{p} as a memory
|
|
|
+address, which is in fact an integer---call it @var{pint}. It treats
|
|
|
+@var{i} as a number of elements of the type that @var{p} points to.
|
|
|
+These elements' sizes add up to @code{@var{i} * sizeof (*@var{p})}.
|
|
|
+So the sum, as an integer, is @code{@var{pint} + @var{i} * sizeof
|
|
|
+(*@var{p})}. This value is reinterpreted as a pointer like @var{p}.
|
|
|
+
|
|
|
+If the starting pointer value @var{p} and the result do not point at
|
|
|
+parts of the same object, the operation is not officially legitimate,
|
|
|
+and C code is not ``supposed'' to do it. But you can do it anyway,
|
|
|
+and it gives precisely the results described by the procedure above.
|
|
|
+In some special situations it can do something useful, but non-wizards
|
|
|
+should avoid it.
|
|
|
+
|
|
|
+Here's a function to offset a pointer value @emph{as if} it pointed to
|
|
|
+an object of any given size, by explicitly performing that calculation:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdint.h>
|
|
|
+
|
|
|
+void *
|
|
|
+ptr_add (void *p, int i, int objsize)
|
|
|
+@{
|
|
|
+ intptr_t p_address = (intptr_t) p;
|
|
|
+ intptr_t totalsize = (intptr_t) i * objsize;
|
|
|
+ intptr_t new_address = p_address + totalsize;
|
|
|
+ return (void *) new_address;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+@cindex @code{intptr_t}
|
|
|
+This does the same job as @code{@var{p} + @var{i}} with the proper
|
|
|
+pointer type for @var{p}. It uses the type @code{intptr_t}, which is
|
|
|
+defined in the header file @file{stdint.h}. (In practice, @code{long
|
|
|
+long} would always work, but it is cleaner to use @code{intptr_t}.)
|
|
|
+
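The same calculation can be spelled with @code{char *} arithmetic,
where each step is one byte; this form needs no conversion to an
integer. The sketch below (function name invented) checks that
stepping an @code{int *} by 3 elements lands on the same address as
stepping a @code{char *} by @code{3 * sizeof (int)} bytes.

```c
#include <stdbool.h>

bool
byte_and_element_steps_agree (void)
{
  int array[4] = { 0, 0, 0, 0 };
  int *p = array;

  /* Typed arithmetic: advance by 3 elements.  */
  char *by_elements = (char *) (p + 3);
  /* Byte arithmetic: advance by 3 * sizeof (int) bytes.  */
  char *by_bytes = (char *) p + 3 * sizeof *p;

  return by_elements == by_bytes;
}
```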
|
|
|
+@node Pointer Increment/Decrement
|
|
|
+@section Pointer Increment and Decrement
|
|
|
+@cindex pointer increment and decrement
|
|
|
+@cindex incrementing pointers
|
|
|
+@cindex decrementing pointers
|
|
|
+
|
|
|
+The @samp{++} operator adds 1 to a variable. We have seen it for
|
|
|
+integers (@pxref{Increment/Decrement}), but it works for pointers too.
|
|
|
+For instance, suppose we have a series of positive integers,
|
|
|
+terminated by a zero, and we want to add them all up.
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+sum_array_till_0 (int *p)
|
|
|
+@{
|
|
|
+ int sum = 0;
|
|
|
+
|
|
|
+ for (;;)
|
|
|
+ @{
|
|
|
+ /* @r{Fetch the next integer.} */
|
|
|
+ int next = *p++;
|
|
|
+ /* @r{Exit the loop if it's 0.} */
|
|
|
+ if (next == 0)
|
|
|
+ break;
|
|
|
+ /* @r{Add it into running total.} */
|
|
|
+ sum += next;
|
|
|
+ @}
|
|
|
+
|
|
|
+ return sum;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The statement @samp{break;} will be explained further on (@pxref{break
|
|
|
+Statement}). Used in this way, it immediately exits the surrounding
|
|
|
+@code{for} statement.
|
|
|
+
|
|
|
+@code{*p++} parses as @code{*(p++)}, because a postfix operator always
|
|
|
+takes precedence over a prefix operator. Therefore, it dereferences
|
|
|
+@code{p}, and increments @code{p} afterwards. Incrementing a variable
|
|
|
+means adding 1 to it, as in @code{p = p + 1}. Since @code{p} is a
|
|
|
+pointer, adding 1 to it advances it by the width of the datum it
|
|
|
+points to---in this case, one @code{int}. Therefore, each iteration
|
|
|
+of the loop picks up the next integer from the series and puts it into
|
|
|
+@code{next}.
|
|
|
+
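To make the grouping concrete, the following sketch contrasts @code{*p++} with two similar-looking expressions. The function name @code{increment_forms} and the array contents are made up for this illustration.

```c
/* Store the value of each expression in out[0..2], and the final
   value of a[2] in out[3].  */
void
increment_forms (int out[4])
{
  int a[3] = { 10, 20, 30 };
  int *p = a;

  out[0] = *p++;    /* Fetch a[0]; afterwards p points at a[1].  */
  out[1] = *++p;    /* Advance p to a[2] first, then fetch a[2].  */
  out[2] = (*p)++;  /* Fetch a[2], then increment a[2] itself.  */
  out[3] = a[2];    /* a[2] is now one more than it was.  */
}
```

Only the third form changes the data that @code{p} points to; the first two change @code{p}.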
|
|
|
+This @code{for}-loop has no initialization expression since @code{p}
|
|
|
+and @code{sum} are already initialized, it has no end-test since the
|
|
|
+@samp{break;} statement will exit it, and needs no expression to
|
|
|
+advance it since that's done within the loop by incrementing @code{p}.
|
|
|
+Thus, those three expressions after @code{for} are
|
|
|
+left empty.
|
|
|
+
|
|
|
+Another way to write this function is by keeping the parameter value unchanged
|
|
|
+and using indexing to access the integers in the table.
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+sum_array_till_0_indexing (int *p)
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+ int sum = 0;
|
|
|
+
|
|
|
+ for (i = 0; ; i++)
|
|
|
+ @{
|
|
|
+ /* @r{Fetch the next integer.} */
|
|
|
+ int next = p[i];
|
|
|
+ /* @r{Exit the loop if it's 0.} */
|
|
|
+ if (next == 0)
|
|
|
+ break;
|
|
|
+ /* @r{Add it into running total.} */
|
|
|
+ sum += next;
|
|
|
+ @}
|
|
|
+
|
|
|
+ return sum;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+In this program, instead of advancing @code{p}, we advance @code{i}
|
|
|
+and add it to @code{p}. (Recall that @code{p[i]} means @code{*(p +
|
|
|
+i)}.) Either way, it uses the same address to get the next integer.
|
|
|
+
|
|
|
+It makes no difference in this program whether we write @code{i++} or
|
|
|
+@code{++i}, because the value is not used. All that matters is the
|
|
|
+effect, to increment @code{i}.
|
|
|
+
|
|
|
+The @samp{--} operator also works on pointers; it can be used
|
|
|
+to scan backwards through an array, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+after_last_nonzero (int *p, int len)
|
|
|
+@{
|
|
|
+ /* @r{Set up @code{q} to point just after the last array element.} */
|
|
|
+ int *q = p + len;
|
|
|
+
|
|
|
+ while (q != p)
|
|
|
+ /* @r{Step @code{q} back until it reaches a nonzero element.} */
|
|
|
+ if (*--q != 0)
|
|
|
+ /* @r{Return the index of the element after that nonzero.} */
|
|
|
+ return q - p + 1;
|
|
|
+
|
|
|
+ return 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+That function returns the length of the nonzero part of the
|
|
|
+array specified by its arguments; that is, the index of the
|
|
|
+first zero of the run of zeros at the end.
|
|
|
+
|
|
|
+@node Pointer Arithmetic Drawbacks
|
|
|
+@section Drawbacks of Pointer Arithmetic
|
|
|
+@cindex drawbacks of pointer arithmetic
|
|
|
+@cindex pointer arithmetic, drawbacks
|
|
|
+
|
|
|
+Pointer arithmetic is clean and elegant, but it is also the cause of a
|
|
|
+major security flaw in the C language. Theoretically, it is only
|
|
|
+valid to adjust a pointer within one object allocated as a unit in
|
|
|
+memory. However, if you unintentionally adjust a pointer across the
|
|
|
+bounds of the object and into some other object, the system has no way
|
|
|
+to detect this error.
|
|
|
+
|
|
|
+A bug which does that can easily result in clobbering part of another
|
|
|
+object. For example, with @code{array[-1]} you can read or write the
|
|
|
+nonexistent element before the beginning of an array---probably part
|
|
|
+of some other data.
|
|
|
+
|
|
|
+Combining pointer arithmetic with casts between pointer types, you can
|
|
|
+create a pointer that fails to be properly aligned for its type. For
|
|
|
+example,
|
|
|
+
|
|
|
+@example
|
|
|
+int a[2];
|
|
|
+char *pa = (char *)a;
|
|
|
+int *p = (int *)(pa + 1);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+gives @code{p} a value pointing to an ``integer'' that includes part
|
|
|
+of @code{a[0]} and part of @code{a[1]}. Dereferencing that with
|
|
|
+@code{*p} can cause a fatal @code{SIGSEGV} signal or it can return the
|
|
|
+contents of that badly aligned @code{int} (@pxref{Signals}). If it
|
|
|
+``works,'' it may be quite slow. It can also cause aliasing
|
|
|
+confusions (@pxref{Aliasing}).
|
|
|
+
|
|
|
+@strong{Warning:} Using improperly aligned pointers is risky---don't do it
|
|
|
+unless it is really necessary.
|
|
|
+
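When a program really does need to read an @code{int} stored at an arbitrary byte address, one common way to avoid forming a misaligned @code{int *} is to copy the bytes with @code{memcpy}. This is only a sketch; the function name @code{read_int_unaligned} is hypothetical.

```c
#include <string.h>

/* Read an int whose bytes start at BYTES, which need not be aligned
   for int.  No int pointer to the unaligned address is created.  */
int
read_int_unaligned (const char *bytes)
{
  int value;
  memcpy (&value, bytes, sizeof value);
  return value;
}
```

The copy goes byte by byte as far as the language is concerned, so the alignment of the source address does not matter.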
|
|
|
+@node Pointer-Integer Conversion
|
|
|
+@section Pointer-Integer Conversion
|
|
|
+@cindex pointer-integer conversion
|
|
|
+@cindex conversion between pointers and integers
|
|
|
+@cindex @code{uintptr_t}
|
|
|
+
|
|
|
+On modern computers, an address is simply a number. It occupies the
|
|
|
+same space as some size of integer. In C, you can convert a pointer
|
|
|
+to the appropriate integer types and vice versa, without losing
|
|
|
+information. The appropriate integer types are @code{uintptr_t} (an
|
|
|
+unsigned type) and @code{intptr_t} (a signed type). Both are defined
|
|
|
+in @file{stdint.h}.
|
|
|
+
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdint.h>
|
|
|
+#include <stdio.h>
|
|
|
+
|
|
|
+void
|
|
|
+print_pointer (void *ptr)
|
|
|
+@{
|
|
|
+ uintptr_t converted = (uintptr_t) ptr;
|
|
|
+
|
|
|
+ printf ("Pointer value is 0x%x\n",
|
|
|
+ (unsigned int) converted);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The specification @samp{%x} in the template (the first argument) for
|
|
|
+@code{printf} means to represent this argument using hexadecimal
|
|
|
+notation. It's cleaner to use @code{uintptr_t}, since hexadecimal
|
|
|
+printing treats the number as unsigned, but it won't actually matter:
|
|
|
+all @code{printf} gets to see is the series of bits in the number.
|
|
|
+
|
|
|
+@strong{Warning:} Converting pointers to integers is risky---don't do
|
|
|
+it unless it is really necessary.
|
|
|
+
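Note that the cast to @code{unsigned int} in the example above can drop high-order bits on platforms where pointers are wider than @code{unsigned int}. The following sketch prints the full value using the @code{PRIxPTR} macro from @file{inttypes.h}, and also checks the round trip from pointer to integer and back. The function names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format PTR in hexadecimal into BUF, using the printf conversion
   that matches uintptr_t.  Returns what snprintf returns.  */
int
format_pointer (char *buf, size_t size, void *ptr)
{
  uintptr_t converted = (uintptr_t) ptr;
  return snprintf (buf, size, "0x%" PRIxPTR, converted);
}

/* Return 1 if converting PTR to uintptr_t and back gives a pointer
   that compares equal to PTR.  */
int
round_trip_ok (void *ptr)
{
  uintptr_t n = (uintptr_t) ptr;
  void *back = (void *) n;
  return back == ptr;
}
```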
|
|
|
+@node Printing Pointers
|
|
|
+@section Printing Pointers
|
|
|
+
|
|
|
+To print the numeric value of a pointer, use the @samp{%p} specifier.
|
|
|
+For example:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+print_pointer (void *ptr)
|
|
|
+@{
|
|
|
+ printf ("Pointer value is %p\n", ptr);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The specification @samp{%p} expects an argument of type @code{void *};
|
|
|
+strictly, other pointer types should be cast to that type. In the GNU C
|
|
|
+Library, it typically prints @samp{0x} followed by the address in hexadecimal.
|
|
|
+
|
|
|
+@node Structures
|
|
|
+@chapter Structures
|
|
|
+@cindex structures
|
|
|
+@findex struct
|
|
|
+@cindex fields in structures
|
|
|
+
|
|
|
+A @dfn{structure} is a user-defined data type that holds various
|
|
|
+@dfn{fields} of data. Each field has a name and a data type specified
|
|
|
+in the structure's definition.
|
|
|
+
|
|
|
+Here we define a structure suitable for storing a linked list of
|
|
|
+integers. Each list item will hold one integer, plus a pointer
|
|
|
+to the next item.
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink
|
|
|
+ @{
|
|
|
+ int datum;
|
|
|
+ struct intlistlink *next;
|
|
|
+ @};
|
|
|
+@end example
|
|
|
+
|
|
|
+The structure definition has a @dfn{type tag} so that the code can
|
|
|
+refer to this structure. The type tag here is @code{intlistlink}.
|
|
|
+The definition refers recursively to the same structure through that
|
|
|
+tag.
|
|
|
+
|
|
|
+You can define a structure without a type tag, but then you can't
|
|
|
+refer to it again. That is useful only in some special contexts, such
|
|
|
+as inside a @code{typedef} or a @code{union}.
|
|
|
+
|
|
|
+The contents of the structure are specified by the @dfn{field
|
|
|
+declarations} inside the braces. Each field in the structure needs a
|
|
|
+declaration there. The fields in one structure definition must have
|
|
|
+distinct names, but these names do not conflict with any other names
|
|
|
+in the program.
|
|
|
+
|
|
|
+A field declaration looks just like a variable declaration. You can
|
|
|
+combine field declarations with the same beginning, just as you can
|
|
|
+combine variable declarations.
|
|
|
+
|
|
|
+This structure has two fields. One, named @code{datum}, has type
|
|
|
+@code{int} and will hold one integer in the list. The other, named
|
|
|
+@code{next}, is a pointer to another @code{struct intlistlink}
|
|
|
+which would be the rest of the list. In the last list item, it would
|
|
|
+be @code{NULL}.
|
|
|
+
|
|
|
+This structure definition is recursive, since the type of the
|
|
|
+@code{next} field refers to the structure type. Such recursion is not
|
|
|
+a problem; in fact, you can use the type @code{struct intlistlink *}
|
|
|
+before the definition of the type @code{struct intlistlink} itself.
|
|
|
+That works because pointers to all kinds of structures really look the
|
|
|
+same at the machine level.
|
|
|
+
|
|
|
+After defining the structure, you can declare a variable of type
|
|
|
+@code{struct intlistlink} like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+The structure definition itself can serve as the beginning of a
|
|
|
+variable declaration, so you can declare variables immediately after,
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink
|
|
|
+ @{
|
|
|
+ int datum;
|
|
|
+ struct intlistlink *next;
|
|
|
+ @} foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+But that is ugly. It is almost always clearer to separate the
|
|
|
+definition of the structure from its uses.
|
|
|
+
|
|
|
+Declaring a structure type inside a block (@pxref{Blocks}) limits
|
|
|
+the scope of the structure type name to that block. That means the
|
|
|
+structure type is recognized only within that block. Declaring it in
|
|
|
+a function parameter list, as here,
|
|
|
+
|
|
|
+@example
|
|
|
+int f (struct foo @{int a, b;@} parm);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+(assuming that @code{struct foo} is not already defined) limits the
|
|
|
+scope of the structure type @code{struct foo} to that parameter list;
|
|
|
+that is basically useless, so it triggers a warning.
|
|
|
+
|
|
|
+Standard C requires at least one field in a structure.
|
|
|
+GNU C does not require this.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Referencing Fields:: Accessing field values in a structure object.
|
|
|
+* Dynamic Memory Allocation:: Allocating space for objects
|
|
|
+ while the program is running.
|
|
|
+* Field Offset:: Memory layout of fields within a structure.
|
|
|
+* Structure Layout:: Planning the memory layout of fields.
|
|
|
+* Packed Structures:: Packing structure fields as close as possible.
|
|
|
+* Bit Fields:: Dividing integer fields
|
|
|
+ into fields with fewer bits.
|
|
|
+* Bit Field Packing:: How bit fields pack together in integers.
|
|
|
+* const Fields:: Making structure fields immutable.
|
|
|
+* Zero Length:: Zero-length array as a variable-length object.
|
|
|
+* Flexible Array Fields:: Another approach to variable-length objects.
|
|
|
+* Overlaying Structures:: Casting one structure type
|
|
|
+ over an object of another structure type.
|
|
|
+* Structure Assignment:: Assigning values to structure objects.
|
|
|
+* Unions:: Viewing the same object in different types.
|
|
|
+* Packing With Unions:: Using a union type to pack various types into
|
|
|
+ the same memory space.
|
|
|
+* Cast to Union:: Casting a value of one of the union's alternative
|
|
|
+ types to the type of the union itself.
|
|
|
+* Structure Constructors:: Building new structure objects.
|
|
|
+* Unnamed Types as Fields:: Fields' types do not always need names.
|
|
|
+* Incomplete Types:: Types which have not been fully defined.
|
|
|
+* Intertwined Incomplete Types:: Defining mutually-recursive structure types.
|
|
|
+* Type Tags:: Scope of structure and union type tags.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Referencing Fields
|
|
|
+@section Referencing Structure Fields
|
|
|
+@cindex referencing structure fields
|
|
|
+@cindex structure fields, referencing
|
|
|
+
|
|
|
+To make a structure useful, there has to be a way to examine and store
|
|
|
+its fields. The @samp{.} (period) operator does that; its use looks
|
|
|
+like @code{@var{object}.@var{field}}.
|
|
|
+
|
|
|
+Given this structure and variable,
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink
|
|
|
+ @{
|
|
|
+ int datum;
|
|
|
+ struct intlistlink *next;
|
|
|
+ @};
|
|
|
+
|
|
|
+struct intlistlink foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+you can write @code{foo.datum} and @code{foo.next} to refer to the two
|
|
|
+fields in the value of @code{foo}. These fields are lvalues, so you
|
|
|
+can store values into them, and read the values out again.
|
|
|
+
|
|
|
+Most often, structures are dynamically allocated (see the next
|
|
|
+section), and we refer to the objects via pointers.
|
|
|
+@code{(*p).@var{field}} is somewhat cumbersome, so there is an
|
|
|
+abbreviation: @code{p->@var{field}}. For instance, assume the program
|
|
|
+contains this declaration:
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink *ptr;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+You can write @code{ptr->datum} and @code{ptr->next} to refer
|
|
|
+to the two fields in the object that @code{ptr} points to.
|
|
|
+
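As an illustration of @samp{->} in use, here is a small sketch that walks a list of @code{struct intlistlink} objects and adds up the @code{datum} fields. The function name @code{sum_intlist} is hypothetical; the structure definition is repeated so that the fragment stands alone.

```c
#include <stddef.h>   /* Defines NULL.  */

struct intlistlink
  {
    int datum;
    struct intlistlink *next;
  };

/* Add up the datum fields of all the links reachable from PTR.
   An empty list (a null pointer) gives 0.  */
int
sum_intlist (struct intlistlink *ptr)
{
  int sum = 0;

  for (; ptr != NULL; ptr = ptr->next)
    sum += ptr->datum;

  return sum;
}
```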
|
|
|
+If a unary operator precedes an expression using @samp{->},
|
|
|
+the @samp{->} nests inside:
|
|
|
+
|
|
|
+@example
|
|
|
+ -ptr->datum @r{is equivalent to} -(ptr->datum)
|
|
|
+@end example
|
|
|
+
|
|
|
+You can intermix @samp{->} and @samp{.} without parentheses,
|
|
|
+as shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+struct @{ double d; struct intlistlink l; @} foo;
|
|
|
+
|
|
|
+@r{@dots{}}foo.l.next->next->datum@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Dynamic Memory Allocation
|
|
|
+@section Dynamic Memory Allocation
|
|
|
+@cindex dynamic memory allocation
|
|
|
+@cindex memory allocation, dynamic
|
|
|
+@cindex allocating memory dynamically
|
|
|
+
|
|
|
+To allocate an object dynamically, call the library function
|
|
|
+@code{malloc} (@pxref{Basic Allocation, The GNU C Library,, libc, The GNU C Library
|
|
|
+Reference Manual}). Here is how to allocate an object of type
|
|
|
+@code{struct intlistlink}. To make this code work, include the file
|
|
|
+@file{stdlib.h}, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stddef.h> /* @r{Defines @code{NULL}.} */
|
|
|
+#include <stdlib.h> /* @r{Declares @code{malloc}.} */
|
|
|
+
|
|
|
+@dots{}
|
|
|
+
|
|
|
+struct intlistlink *
|
|
|
+alloc_intlistlink ()
|
|
|
+@{
|
|
|
+ struct intlistlink *p;
|
|
|
+
|
|
|
+ p = malloc (sizeof (struct intlistlink));
|
|
|
+
|
|
|
+ if (p == NULL)
|
|
|
+ fatal ("Ran out of storage");
|
|
|
+
|
|
|
+ /* @r{Initialize the contents.} */
|
|
|
+ p->datum = 0;
|
|
|
+ p->next = NULL;
|
|
|
+
|
|
|
+ return p;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+@code{malloc} returns @code{void *}, so the assignment to @code{p}
|
|
|
+will automatically convert it to type @code{struct intlistlink *}.
|
|
|
+The return value of @code{malloc} is always sufficiently aligned
|
|
|
+(@pxref{Type Alignment}) that it is valid for any data type.
|
|
|
+
|
|
|
+The test for @code{p == NULL} is necessary because @code{malloc}
|
|
|
+returns a null pointer if it cannot get any storage. We assume that
|
|
|
+the program defines the function @code{fatal} to report a fatal error
|
|
|
+to the user.
|
|
|
+
|
|
|
+Here's how to add one more integer to the front of such a list:
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink *mylist = NULL;
|
|
|
+
|
|
|
+void
|
|
|
+add_to_mylist (int my_int)
|
|
|
+@{
|
|
|
+ struct intlistlink *p = alloc_intlistlink ();
|
|
|
+
|
|
|
+ p->datum = my_int;
|
|
|
+ p->next = mylist;
|
|
|
+ mylist = p;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The way to free the objects is by calling @code{free}. Here's
|
|
|
+a function to free all the links in one of these lists:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+free_intlist (struct intlistlink *p)
|
|
|
+@{
|
|
|
+ while (p)
|
|
|
+ @{
|
|
|
+ struct intlistlink *q = p;
|
|
|
+ p = p->next;
|
|
|
+ free (q);
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+We must extract the @code{next} pointer from the object before freeing
|
|
|
+it, because @code{free} can clobber the data that was in the object.
|
|
|
+For the same reason, the program must not use the list any more after
|
|
|
+freeing its elements. To make sure it won't, it is best to clear out
|
|
|
+the variable where the list was stored, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+free_intlist (mylist);
|
|
|
+
|
|
|
+mylist = NULL;
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Field Offset
|
|
|
+@section Field Offset
|
|
|
+@cindex field offset
|
|
|
+@cindex structure field offset
|
|
|
+@cindex offset of structure fields
|
|
|
+
|
|
|
+To determine the offset of a given field @var{field} in a structure
|
|
|
+type @var{type}, use the macro @code{offsetof}, which is defined in
|
|
|
+the file @file{stddef.h}. It is used like this:
|
|
|
+
|
|
|
+@example
|
|
|
+offsetof (@var{type}, @var{field})
|
|
|
+@end example
|
|
|
+
|
|
|
+Here is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo
|
|
|
+@{
|
|
|
+ int element;
|
|
|
+ struct foo *next;
|
|
|
+@};
|
|
|
+
|
|
|
+offsetof (struct foo, next)
|
|
|
+/* @r{That is 4 on typical 32-bit machines and 8 on typical 64-bit machines.} */
|
|
|
+@end example
|
|
|
+
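One way to see what @code{offsetof} reports is to compare it with the distance, in bytes, between the start of an object and the field inside it. This sketch does that; the function name @code{offset_matches} is hypothetical.

```c
#include <stddef.h>   /* Defines offsetof and size_t.  */

struct foo
{
  int element;
  struct foo *next;
};

/* Return 1 if offsetof agrees with the byte distance computed
   from the addresses of an actual object and its field.  */
int
offset_matches (void)
{
  struct foo obj;
  size_t by_address = (size_t) ((char *) &obj.next - (char *) &obj);

  return by_address == offsetof (struct foo, next);
}
```

The first field of a structure is always at offset 0, so @code{offsetof (struct foo, element)} is 0 on every platform.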
|
|
|
+@node Structure Layout
|
|
|
+@section Structure Layout
|
|
|
+@cindex structure layout
|
|
|
+@cindex layout of structures
|
|
|
+
|
|
|
+The rest of this chapter covers advanced topics about structures. If
|
|
|
+you are just learning C, you can skip it.
|
|
|
+
|
|
|
+The precise layout of a @code{struct} type is crucial when using it to
|
|
|
+overlay hardware registers, to access data structures in shared
|
|
|
+memory, or to assemble and disassemble packets for network
|
|
|
+communication. It is also important for avoiding memory waste when
|
|
|
+the program makes many objects of that type. However, the layout
|
|
|
+depends on the target platform. Each platform has conventions for
|
|
|
+structure layout, which compilers need to follow.
|
|
|
+
|
|
|
+Here are the conventions used on most platforms.
|
|
|
+
|
|
|
+The structure's fields appear in the structure layout in the order
|
|
|
+they are declared. When possible, consecutive fields occupy
|
|
|
+consecutive bytes within the structure. However, if a field's type
|
|
|
+demands more alignment than it would get that way, C gives it the
|
|
|
+alignment it requires by leaving a gap after the previous field.
|
|
|
+
|
|
|
+Once all the fields have been laid out, it is possible to determine
|
|
|
+the structure's alignment and size. The structure's alignment is the
|
|
|
+maximum alignment of any of the fields in it. Then the structure's
|
|
|
+size is rounded up to a multiple of its alignment. That may require
|
|
|
+leaving a gap at the end of the structure.
|
|
|
+
|
|
|
+Here are some examples, where we assume that @code{char} has size and
|
|
|
+alignment 1 (always true), and @code{int} has size and alignment 4
|
|
|
+(true on most kinds of computers):
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo
|
|
|
+@{
|
|
|
+ char a, b;
|
|
|
+ int c;
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This structure occupies 8 bytes, with an alignment of 4. @code{a} is
|
|
|
+at offset 0, @code{b} is at offset 1, and @code{c} is at offset 4.
|
|
|
+There is a gap of 2 bytes before @code{c}.
|
|
|
+
|
|
|
+Contrast that with this structure:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo
|
|
|
+@{
|
|
|
+ char a;
|
|
|
+ int c;
|
|
|
+ char b;
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+This structure has size 12 and alignment 4. @code{a} is at offset 0,
|
|
|
+@code{c} is at offset 4, and @code{b} is at offset 8. There are two
|
|
|
+gaps: three bytes before @code{c}, and three bytes at the end.
|
|
|
+
|
|
|
+These two structures have the same contents at the C level, but one
|
|
|
+takes 8 bytes and the other takes 12 bytes due to the ordering of the
|
|
|
+fields. A reliable way to avoid this sort of wastage is to order the
|
|
|
+fields by size, biggest fields first.
|
|
|
+
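The effect of field order can be measured directly with @code{sizeof}. The sketch below gives the two structures distinct, hypothetical tags so that both can appear in one file, and computes how many bytes of each are padding. The exact numbers depend on the platform; with a 4-byte @code{int} aligned to 4 they would be 2 and 6.

```c
#include <stddef.h>

struct chars_first { char a, b; int c; };
struct int_between { char a; int c; char b; };

/* Bytes of padding: total size minus the sizes of the fields.  */
size_t
padding_in_chars_first (void)
{
  return sizeof (struct chars_first)
         - (2 * sizeof (char) + sizeof (int));
}

size_t
padding_in_int_between (void)
{
  return sizeof (struct int_between)
         - (2 * sizeof (char) + sizeof (int));
}
```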
|
|
|
+@node Packed Structures
|
|
|
+@section Packed Structures
|
|
|
+@cindex packed structures
|
|
|
+@cindex @code{__attribute__((packed))}
|
|
|
+
|
|
|
+In GNU C you can force a structure to be laid out with no gaps by
|
|
|
+adding @code{__attribute__((packed))} after @code{struct} (or at the
|
|
|
+end of the structure type declaration). Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+struct __attribute__((packed)) foo
|
|
|
+@{
|
|
|
+ char a;
|
|
|
+ int c;
|
|
|
+ char b;
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+Without @code{__attribute__((packed))}, this structure occupies 12
|
|
|
+bytes (as described in the previous section), assuming 4-byte
|
|
|
+alignment for @code{int}. With @code{__attribute__((packed))}, it is
|
|
|
+only 6 bytes long---the sum of the lengths of its fields.
|
|
|
+
|
|
|
+Use of @code{__attribute__((packed))} often results in fields that
|
|
|
+don't have the normal alignment for their types. Taking the address
|
|
|
+of such a field can result in an invalid pointer because of its
|
|
|
+improper alignment. Dereferencing such a pointer can cause a
|
|
|
+@code{SIGSEGV} signal on a machine that doesn't, in general, allow
|
|
|
+unaligned pointers.
|
|
|
+
|
|
|
+@xref{Attributes}.
|
|
|
+
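A safer pattern, sketched here under the assumption of a GNU-compatible compiler, is to access a packed field through the structure rather than through a pointer to the field, so the compiler knows the field may be misaligned. The tag @code{packed_foo} and the function @code{get_c} are hypothetical names.

```c
struct __attribute__((packed)) packed_foo
{
  char a;
  int c;    /* Immediately follows a, so it may be misaligned.  */
  char b;
};

/* Fetch the field by value through the structure pointer.  The
   compiler sees the packed type and can emit a suitable access;
   no int pointer to the field is ever formed.  */
int
get_c (const struct packed_foo *p)
{
  return p->c;
}
```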
|
|
|
+@node Bit Fields
|
|
|
+@section Bit Fields
|
|
|
+@cindex bit fields
|
|
|
+
|
|
|
+A structure field declaration with an integer type can specify the
|
|
|
+number of bits the field should occupy. We call that a @dfn{bit
|
|
|
+field}. These are useful because consecutive bit fields are packed
|
|
|
+into a larger storage unit. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned char opcode: 4;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+specifies that this field takes just 4 bits.
|
|
|
+Since it is unsigned, its possible values range
|
|
|
+from 0 to 15. A signed field with 4 bits, such as this,
|
|
|
+
|
|
|
+@example
|
|
|
+signed char small: 4;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+can hold values from -8 to 7.
|
|
|
+
|
|
|
+You can subdivide a single byte into those two parts by writing
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned char opcode: 4;
|
|
|
+signed char small: 4;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+in the structure. With bit fields, these two numbers fit into
|
|
|
+a single @code{char}.
|
|
|
+
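Here is a small sketch of those two fields inside a structure, with a function showing that a value stored in the unsigned 4-bit field is reduced to its low 4 bits. The tag @code{insn} and the function @code{store_opcode} are hypothetical, and bit fields of type @code{char} are assumed to be accepted, as they are in GNU C.

```c
struct insn
{
  unsigned char opcode: 4;   /* Holds 0 through 15.  */
  signed char small: 4;      /* Holds -8 through 7.  */
};

/* Store VALUE in the opcode field and return what the field holds.
   Conversion to an unsigned bit field keeps only the low 4 bits.  */
unsigned int
store_opcode (unsigned int value)
{
  struct insn i = { 0, 0 };

  i.opcode = value;
  return i.opcode;
}
```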
|
|
|
+Here's how to declare a one-bit field that can hold either 0 or 1:
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned char special_flag: 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also use the @code{bool} type for bit fields:
|
|
|
+
|
|
|
+@example
|
|
|
+bool special_flag: 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+Except when using @code{bool} (which is always unsigned,
|
|
|
+@pxref{Boolean Type}), always specify @code{signed} or @code{unsigned}
|
|
|
+for a bit field. There is a default, if that's not specified: the bit
|
|
|
+field is signed if plain @code{char} is signed, except that the option
|
|
|
+@option{-funsigned-bitfields} forces unsigned as the default. But it
|
|
|
+is cleaner not to depend on this default.
|
|
|
+
|
|
|
+Bit fields are special in that you cannot take their address with
|
|
|
+@samp{&}. They are not stored with the size and alignment appropriate
|
|
|
+for the specified type, so they cannot be addressed through pointers
|
|
|
+to that type.
|
|
|
+
|
|
|
+@node Bit Field Packing
|
|
|
+@section Bit Field Packing
|
|
|
+
|
|
|
+Programs to communicate with low-level hardware interfaces need to
|
|
|
+define bit fields laid out to match the hardware data. This section
|
|
|
+explains how to do that.
|
|
|
+
|
|
|
+Consecutive bit fields are packed together, but each bit field must
|
|
|
+fit within a single object of its specified type. In this example,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned short a : 3, b : 3, c : 3, d : 3, e : 3;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+all five fields fit consecutively into one two-byte @code{short}.
|
|
|
+They need 15 bits, and one @code{short} provides 16. By contrast,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned char a : 3, b : 3, c : 3, d : 3, e : 3;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+needs three bytes. It fits @code{a} and @code{b} into one
|
|
|
+@code{char}, but @code{c} won't fit in that @code{char} (they would
|
|
|
+add up to 9 bits). So @code{c} and @code{d} go into a second
|
|
|
+@code{char}, leaving a gap of two bits between @code{b} and @code{c}.
|
|
|
+Then @code{e} needs a third @code{char}. By contrast,
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned char a : 3, b : 3;
|
|
|
+unsigned int c : 3;
|
|
|
+unsigned char d : 3, e : 3;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+needs only two bytes: the type @code{unsigned int}
|
|
|
+allows @code{c} to straddle bytes that are in the same word.
|
|
|
+
|
|
|
+You can leave a gap of a specified number of bits by defining a
|
|
|
+nameless bit field. This looks like @code{@var{type} : @var{nbits};}.
|
|
|
+It is allocated space in the structure just as a named bit field would
|
|
|
+be allocated.
|
|
|
+
|
|
|
+You can force the following bit field to advance to the following
|
|
|
+aligned memory object with @code{@var{type} : 0;}.
|
|
|
+
|
|
|
+Both of these constructs can syntactically share @var{type} with
|
|
|
+ordinary bit fields. This example illustrates both:
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned int a : 5, : 3, b : 5, : 0, c : 5, : 3, d : 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+It puts @code{a} and @code{b} into one @code{int}, with a 3-bit gap
|
|
|
+between them. Then @code{: 0} advances to the next @code{int},
|
|
|
+so @code{c} and @code{d} fit into that one.
|
|
|
+
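These packing rules can be checked with @code{sizeof}. The sketch below wraps three of the declarations above in structures with hypothetical tags; the sizes in the comments are what the rules predict for a platform with 8-bit bytes, 2-byte @code{short} and 4-byte @code{int}, and other targets may differ.

```c
/* Five 3-bit fields need 15 bits: one 2-byte short.  */
struct five_in_short
{
  unsigned short a : 3, b : 3, c : 3, d : 3, e : 3;
};

/* No field may straddle a char, so this needs 3 bytes.  */
struct five_in_char
{
  unsigned char a : 3, b : 3, c : 3, d : 3, e : 3;
};

/* a and b share the first int; ": 0" moves c and d to the next.  */
struct with_gaps
{
  unsigned int a : 5, : 3, b : 5, : 0, c : 5, : 3, d : 5;
};
```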
|
|
|
+These rules for packing bit fields apply to most target platforms,
|
|
|
+including all the usual real computers. A few embedded controllers
|
|
|
+have special layout rules.
|
|
|
+
|
|
|
+@node const Fields
|
|
|
+@section @code{const} Fields
|
|
|
+@cindex const fields
|
|
|
+@cindex structure fields, constant
|
|
|
+
|
|
|
+@c ??? Is this a C standard feature?
|
|
|
+
|
|
|
+A structure field declared @code{const} cannot be assigned to
|
|
|
+(@pxref{const}). For instance, let's define this modified version of
|
|
|
+@code{struct intlistlink}:
|
|
|
+
|
|
|
+@example
|
|
|
+struct intlistlink_ro /* @r{``ro'' for read-only.} */
|
|
|
+ @{
|
|
|
+ const int datum;
|
|
|
+ struct intlistlink *next;
|
|
|
+ @};
|
|
|
+@end example
|
|
|
+
|
|
|
+This structure can be used to prevent part of the code from modifying
|
|
|
+the @code{datum} field:
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{@code{p} has type @code{struct intlistlink *}.}
|
|
|
+ @r{Convert it to @code{struct intlistlink_ro *}.} */
|
|
|
+struct intlistlink_ro *q
|
|
|
+ = (struct intlistlink_ro *) p;
|
|
|
+
|
|
|
+q->datum = 5; /* @r{Error!} */
|
|
|
+p->datum = 5; /* @r{Valid since @code{*p} is}
|
|
|
+ @r{not a @code{struct intlistlink_ro}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+A @code{const} field can get a value in two ways: by initialization of
|
|
|
+the whole structure, and by making a pointer-to-structure point to an object
|
|
|
+in which that field already has a value.
|
|
|
+
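Both ways of giving a @code{const} field its value can be sketched as follows. The tag @code{ro_item} and the function names are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>   /* Defines NULL.  */

struct ro_item
{
  const int datum;
  struct ro_item *next;
};

/* First way: the field gets its value when the whole structure
   is initialized.  Assigning to item.datum afterwards would be
   rejected by the compiler.  */
int
datum_of_initialized_item (void)
{
  struct ro_item item = { 5, NULL };
  return item.datum;
}

/* Second way: the pointer Q is made to point at an object whose
   field already has a value; the code here can only read it.  */
int
datum_via_pointer (const struct ro_item *q)
{
  return q->datum;
}
```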
|
|
|
+Any @code{const} field in a structure type makes assignment impossible
|
|
|
+for structures of that type (@pxref{Structure Assignment}). That is
|
|
|
+because structure assignment works by assigning the structure's
|
|
|
+fields, one by one.
|
|
|
+
|
|
|
+@node Zero Length
|
|
|
+@section Arrays of Length Zero
|
|
|
+@cindex array of length zero
|
|
|
+@cindex zero-length arrays
|
|
|
+@cindex length-zero arrays
|
|
|
+
|
|
|
+GNU C allows zero-length arrays. They are useful as the last element
|
|
|
+of a structure that is really a header for a variable-length object.
|
|
|
+Here's an example, where we construct a variable-size structure
|
|
|
+to hold a line which is @code{this_length} characters long:
|
|
|
+
|
|
|
+@example
|
|
|
+struct line @{
|
|
|
+ int length;
|
|
|
+ char contents[0];
|
|
|
+@};
|
|
|
+
|
|
|
+struct line *thisline
|
|
|
+ = ((struct line *)
|
|
|
+ malloc (sizeof (struct line)
|
|
|
+ + this_length));
|
|
|
+thisline->length = this_length;
|
|
|
+@end example
|
|
|
+
|
|
|
+In ISO C90, we would have to give @code{contents} a length of 1, which
|
|
|
+means either wasting space or complicating the argument to @code{malloc}.
|
|
|
+
|
|
|
+@node Flexible Array Fields
|
|
|
+@section Flexible Array Fields
|
|
|
+@cindex flexible array fields
|
|
|
+@cindex array fields, flexible
|
|
|
+
|
|
|
+The C99 standard adopted a more complex equivalent of zero-length
|
|
|
+array fields. It's called a @dfn{flexible array}, and it's indicated
|
|
|
+by omitting the length, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct line
|
|
|
+@{
|
|
|
+ int length;
|
|
|
+ char contents[];
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+The flexible array has to be the last field in the structure, and there
|
|
|
+must be other fields before it.
|
|
|
+
|
|
|
+Under the C standard, a structure with a flexible array can't be part
|
|
|
+of another structure, and can't be an element of an array.
|
|
|
+
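Allocation works the same way as with a zero-length array: ask @code{malloc} for the size of the header plus the space wanted for the array. The following is a minimal sketch; the function name @code{make_line} is hypothetical, and it returns a null pointer on failure rather than reporting an error.

```c
#include <stdlib.h>   /* Declares malloc.  */
#include <string.h>   /* Declares strlen and memcpy.  */

struct line
{
  int length;
  char contents[];
};

/* Allocate a struct line holding a copy of the characters of TEXT
   (without the terminating null).  Returns NULL if out of memory.  */
struct line *
make_line (const char *text)
{
  size_t len = strlen (text);
  struct line *l = malloc (sizeof (struct line) + len);

  if (l == NULL)
    return NULL;

  l->length = (int) len;
  memcpy (l->contents, text, len);
  return l;
}
```

The caller releases the whole object, header and array together, with a single call to @code{free}.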
|
|
|
+GNU C allows static initialization of flexible array fields. The effect
|
|
|
+is to ``make the array long enough'' for the initializer.
|
|
|
+
|
|
|
+@example
|
|
|
+struct f1 @{ int x; int y[]; @} f1
|
|
|
+ = @{ 1, @{ 2, 3, 4 @} @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This defines a structure variable named @code{f1}
|
|
|
+whose type is @code{struct f1}. In C, a variable name or function name
|
|
|
+never conflicts with a structure type tag.
|
|
|
+
|
|
|
+Omitting the flexible array field's size lets the initializer
|
|
|
+determine it. This is allowed only when the flexible array is defined
|
|
|
+in the outermost structure and you declare a variable of that
|
|
|
+structure type. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo @{ int x; int y[]; @};
|
|
|
+struct bar @{ struct foo z; @};
|
|
|
+
|
|
|
+struct foo a = @{ 1, @{ 2, 3, 4 @} @}; // @r{Valid.}
|
|
|
+struct bar b = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
|
|
|
+struct bar c = @{ @{ 1, @{ @} @} @}; // @r{Valid.}
|
|
|
+struct foo d[1] = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Overlaying Structures
|
|
|
+@section Overlaying Different Structures
|
|
|
+@cindex overlaying structures
|
|
|
+@cindex structures, overlaying
|
|
|
+
|
|
|
+Be careful about using different structure types to refer to the same
|
|
|
+memory within one function, because GNU C can optimize code assuming
|
|
|
+it never does that. @xref{Aliasing}. Here's an example of the kind of
|
|
|
+aliasing that can cause the problem:
|
|
|
+
|
|
|
+@example
|
|
|
+struct a @{ int size; char *data; @};
|
|
|
+struct b @{ int size; char *data; @};
|
|
|
+struct a foo;
|
|
|
+struct a *p = &foo;
|
|
|
+struct b *q = (struct b *) &foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+Here @code{q} points to the same memory that the variable @code{foo}
|
|
|
+occupies, but they have two different types. The two types
|
|
|
+@code{struct a} and @code{struct b} are defined alike, but they are
|
|
|
+not the same type. Interspersing references using the two types,
|
|
|
+like this,
|
|
|
+
|
|
|
+@example
|
|
|
+p->size = 0;
|
|
|
+q->size = 1;
|
|
|
+x = p->size;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+allows GNU C to assume that @code{p->size} is still zero when it is
|
|
|
+copied into @code{x}. The compiler ``knows'' that @code{q} points to
|
|
|
+a @code{struct b} and this cannot overlap with a @code{struct a}.
|
|
|
+
|
|
|
+Other compilers might also do this optimization. The ISO C standard
|
|
|
+considers such code erroneous, precisely so that this optimization
|
|
|
+will be valid.
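
When code really needs the same bytes under a second structure type, one way to stay clear of the aliasing problem is to copy the bytes rather than cast the pointer. The following is only a sketch: the helper name @code{a_to_b} is hypothetical, and it assumes the two structure types have the same size and layout.

```c
#include <assert.h>
#include <string.h>

struct a { int size; char *data; };
struct b { int size; char *data; };

/* Hypothetical helper: return a struct b holding the same bytes as *src.
   Copying with memcpy never accesses a struct a object through an
   lvalue of type struct b.  */
struct b
a_to_b (const struct a *src)
{
  struct b dst;
  /* Assumption: both types have the same size and layout.  */
  assert (sizeof dst == sizeof *src);
  memcpy (&dst, src, sizeof dst);
  return dst;
}
```

After the copy, the two objects are independent, so references through either type cannot interfere with each other.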
|
|
|
+
|
|
|
+@node Structure Assignment
|
|
|
+@section Structure Assignment
|
|
|
+@cindex structure assignment
|
|
|
+@cindex assigning structures
|
|
|
+
|
|
|
+Assignment operating on a structure type copies the structure. The
|
|
|
+left and right operands must have the same type. Here is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stddef.h> /* @r{Defines @code{NULL}.} */
|
|
|
+#include <stdlib.h> /* @r{Declares @code{malloc}.} */
|
|
|
+@r{@dots{}}
|
|
|
+
|
|
|
+struct point @{ double x, y; @};
|
|
|
+
|
|
|
+struct point *
|
|
|
+copy_point (struct point point)
|
|
|
+@{
|
|
|
+ struct point *p
|
|
|
+ = (struct point *) malloc (sizeof (struct point));
|
|
|
+ if (p == NULL)
|
|
|
+ fatal ("Out of memory");
|
|
|
+ *p = point;
|
|
|
+ return p;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Notionally, assignment on a structure type works by copying each of
|
|
|
+the fields. Thus, if any of the fields has the @code{const}
|
|
|
+qualifier, that structure type does not allow assignment:
|
|
|
+
|
|
|
+@example
|
|
|
+struct point @{ const double x, y; @};
|
|
|
+
|
|
|
+struct point a, b;
|
|
|
+
|
|
|
+a = b; /* @r{Error!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@xref{Assignment Expressions}.
|
|
|
+
|
|
|
+@node Unions
|
|
|
+@section Unions
|
|
|
+@cindex unions
|
|
|
+@findex union
|
|
|
+
|
|
|
+A @dfn{union type} defines alternative ways of looking at the same
|
|
|
+piece of memory. Each alternative view is defined with a data type,
|
|
|
+and identified by a name. A union definition looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+union @var{name}
|
|
|
+@{
|
|
|
+ @var{alternative declarations}@r{@dots{}}
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+Each alternative declaration looks like a structure field declaration,
|
|
|
+except that it can't be a bit field. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+union number
|
|
|
+@{
|
|
|
+ long int integer;
|
|
|
+  double floating;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+lets you store either an integer (type @code{long int}) or a floating
|
|
|
+point number (type @code{double}) in the same place in memory. The
|
|
|
+length and alignment of the union type are the maximum of all the
|
|
|
+alternatives---they do not have to be the same. In this union
|
|
|
+example, @code{double} probably takes more space than @code{long int},
|
|
|
+but that doesn't cause a problem in programs that use the union in the
|
|
|
+normal way.
|
|
|
+
|
|
|
+The members don't have to be different in data type. Sometimes
|
|
|
+each member pertains to a way the data will be used. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+union datum
|
|
|
+@{
|
|
|
+ double latitude;
|
|
|
+ double longitude;
|
|
|
+ double height;
|
|
|
+ double weight;
|
|
|
+ int continent;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This union holds one of several kinds of data; most kinds are floating
|
|
|
+point numbers, but the value can also be a code for a continent, which is an
|
|
|
+integer. You @emph{could} use one member of type @code{double} to
|
|
|
+access all the values which have that type, but the different member
|
|
|
+names will make the program clearer.
|
|
|
+
|
|
|
+The alignment of a union type is the maximum of the alignments of the
|
|
|
+alternatives. The size of the union type is the maximum of the sizes
|
|
|
+of the alternatives, rounded up to a multiple of the alignment
|
|
|
+(because every type's size must be a multiple of its alignment).
|
|
|
+
|
|
|
+All the union alternatives start at the address of the union itself.
|
|
|
+If an alternative is shorter than the union as a whole, it occupies
|
|
|
+the first part of the union's storage, leaving the last part unused
|
|
|
+@emph{for that alternative}.
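
The layout rules above can be checked in code. This is a minimal sketch using the @code{union datum} type from earlier in this section; the function name @code{check_datum_layout} is made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

union datum
{
  double latitude;
  double longitude;
  double height;
  double weight;
  int continent;
};

/* Hypothetical checker for the layout rules of union datum.  */
void
check_datum_layout (void)
{
  union datum d;

  /* Every alternative starts at the address of the union itself.  */
  assert ((void *) &d.latitude == (void *) &d);
  assert ((void *) &d.continent == (void *) &d);

  /* The union is at least as large as its largest alternative,
     and its size is a multiple of its alignment.  */
  assert (sizeof (union datum) >= sizeof (double));
  assert (sizeof (union datum) % _Alignof (union datum) == 0);
}
```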
|
|
|
+
|
|
|
+@strong{Warning:} if the code stores data using one union alternative
|
|
|
+and accesses it with another, the results depend on the kind of
|
|
|
+computer in use. Only wizards should try to do this. However, when
|
|
|
+you need to do this, a union is a clean way to do it.
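
As an illustration of storing through one alternative and reading through another, here is a sketch that looks at the bit pattern of a @code{float}. The names @code{union float_bits} and @code{bits_of_float} are hypothetical, and the code assumes that @code{float} is 32 bits wide. On machines that use the IEEE 754 single format, the value 1.0 yields the pattern 0x3f800000; other formats give other results.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union for looking at the bits of a float.  */
union float_bits
{
  float f;
  uint32_t u;
};

/* Return the bit pattern of X, read through the other alternative.
   The result depends on the machine's floating-point format.  */
uint32_t
bits_of_float (float x)
{
  union float_bits fb;
  /* Assumption: float occupies exactly 32 bits.  */
  assert (sizeof (float) == sizeof (uint32_t));
  fb.f = x;
  return fb.u;
}
```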
|
|
|
+
|
|
|
+Assignment works on any union type by copying the entire value.
|
|
|
+
|
|
|
+@node Packing With Unions
|
|
|
+@section Packing With Unions
|
|
|
+
|
|
|
+Sometimes we design a union with the intention of packing various
|
|
|
+kinds of objects into a certain amount of memory space.  For example,
|
|
|
+
|
|
|
+@example
|
|
|
+union bytes8
|
|
|
+@{
|
|
|
+ long long big_int_elt;
|
|
|
+ double double_elt;
|
|
|
+ struct @{ int first, second; @} two_ints;
|
|
|
+ struct @{ void *first, *second; @} two_ptrs;
|
|
|
+@};
|
|
|
+
|
|
|
+union bytes8 *p;
|
|
|
+@end example
|
|
|
+
|
|
|
+This union makes it possible to look at 8 bytes of data that @code{p}
|
|
|
+points to as a single 8-byte integer (@code{p->big_int_elt}), as a
|
|
|
+single floating-point number (@code{p->double_elt}), as a pair of
|
|
|
+integers (@code{p->two_ints.first} and @code{p->two_ints.second}), or
|
|
|
+as a pair of pointers (@code{p->two_ptrs.first} and
|
|
|
+@code{p->two_ptrs.second}).
|
|
|
+
|
|
|
+To pack storage with such a union makes assumptions about the sizes of
|
|
|
+all the types involved. This particular union was written expecting a
|
|
|
+pointer to have the same size as @code{int}. On a machine where one
|
|
|
+pointer takes 8 bytes, the code using this union probably won't work
|
|
|
+as expected. The union, as such, will function correctly---if you
|
|
|
+store two values through @code{two_ints} and extract them through
|
|
|
+@code{two_ints}, you will get the same integers back---but the part of
|
|
|
+the program that expects the union to be 8 bytes long could
|
|
|
+malfunction, or at least use too much space.
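
A program that depends on the size can test the assumption instead of trusting it. The sketch below repeats @code{union bytes8}; the two function names are hypothetical. The first reports whether the union really occupies 8 bytes on the current machine, and the second shows that storing and extracting through the same alternative works whatever the size turns out to be.

```c
#include <assert.h>

union bytes8
{
  long long big_int_elt;
  double double_elt;
  struct { int first, second; } two_ints;
  struct { void *first, *second; } two_ptrs;
};

/* Return nonzero if union bytes8 really occupies 8 bytes here.  */
int
bytes8_is_8_bytes (void)
{
  return sizeof (union bytes8) == 8;
}

/* Store two integers through two_ints and read them back the same way.  */
int
bytes8_round_trip (int a, int b)
{
  union bytes8 u;
  u.two_ints.first = a;
  u.two_ints.second = b;
  assert (u.two_ints.first == a);
  return u.two_ints.second == b;
}
```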
|
|
|
+
|
|
|
+The above example shows one case where a @code{struct} type with no
|
|
|
+tag can be useful. Another way to get effectively the same result
|
|
|
+is with arrays as members of the union:
|
|
|
+
|
|
|
+@example
|
|
|
+union eight_bytes
|
|
|
+@{
|
|
|
+ long long big_int_elt;
|
|
|
+ double double_elt;
|
|
|
+ int two_ints[2];
|
|
|
+ void *two_ptrs[2];
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Cast to Union
|
|
|
+@section Cast to a Union Type
|
|
|
+@cindex cast to a union
|
|
|
+@cindex union, casting to a
|
|
|
+
|
|
|
+In GNU C, you can explicitly cast any of the alternative types to the
|
|
|
+union type; for instance,
|
|
|
+
|
|
|
+@example
|
|
|
+(union eight_bytes) (long long) 5
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+makes a value of type @code{union eight_bytes} which gets its contents
|
|
|
+through the alternative named @code{big_int_elt}.
|
|
|
+
|
|
|
+The value being cast must exactly match the type of the alternative,
|
|
|
+so this is not valid:
|
|
|
+
|
|
|
+@example
|
|
|
+(union eight_bytes) 5 /* @r{Error! 5 is @code{int}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+A cast to union type looks like any other cast, except that the type
|
|
|
+specified is a union type. You can specify the type either with
|
|
|
+@code{union @var{tag}} or with a typedef name (@pxref{Defining
|
|
|
+Typedef Names}).
|
|
|
+
|
|
|
+Using the cast as the right-hand side of an assignment to a variable of
|
|
|
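+union type is equivalent to storing in an alternative of the union.
+In this example, assume that @code{union foo} has the alternatives
+@code{int i} and @code{double d}, that @code{x} has type @code{int},
+and that @code{y} has type @code{double}: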
|
|
|
+
|
|
|
+@example
|
|
|
+union foo u;
|
|
|
+
|
|
|
+u = (union foo) x @r{means} u.i = x
|
|
|
+
|
|
|
+u = (union foo) y @r{means} u.d = y
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also use the union cast as a function argument:
|
|
|
+
|
|
|
+@example
|
|
|
+void hack (union foo);
|
|
|
+@r{@dots{}}
|
|
|
+hack ((union foo) x);
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Structure Constructors
|
|
|
+@section Structure Constructors
|
|
|
+@cindex structure constructors
|
|
|
+@cindex constructors, structure
|
|
|
+
|
|
|
+You can construct a structure value by writing its type in
|
|
|
+parentheses, followed by an initializer that would be valid in a
|
|
|
+declaration for that type. For instance, given this declaration,
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo @{int a; char b[2];@} structure;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+you can create a @code{struct foo} value as follows:
|
|
|
+
|
|
|
+@example
|
|
|
+((struct foo) @{x + y, 'a', 0@})
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This specifies @code{x + y} for field @code{a},
|
|
|
+the character @samp{a} for field @code{b}'s element 0,
|
|
|
+and the null character for field @code{b}'s element 1.
|
|
|
+
|
|
|
+The parentheses around that constructor are not necessary, but we
|
|
|
+recommend writing them to make the nesting of the containing
|
|
|
+expression clearer.
|
|
|
+
|
|
|
+You can also show the nesting of the two by writing it like
|
|
|
+this:
|
|
|
+
|
|
|
+@example
|
|
|
+((struct foo) @{x + y, @{'a', 0@} @})
|
|
|
+@end example
|
|
|
+
|
|
|
+Each of those is equivalent to writing the following statement
|
|
|
+expression (@pxref{Statement Exprs}):
|
|
|
+
|
|
|
+@example
|
|
|
+(@{
|
|
|
+ struct foo temp = @{x + y, 'a', 0@};
|
|
|
+ temp;
|
|
|
+@})
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also create a union value this way, but it is not especially
|
|
|
+useful since that is equivalent to doing a cast:
|
|
|
+
|
|
|
+@example
|
|
|
+ ((union whosis) @{@var{value}@})
|
|
|
+@r{is equivalent to}
|
|
|
+ ((union whosis) (@var{value}))
|
|
|
+@end example
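
Structure constructors are convenient as return values and as the right-hand side of an assignment. Here is a short sketch using the @code{struct foo} type from above; the function names @code{make_foo} and @code{reset_foo} are hypothetical.

```c
#include <assert.h>

struct foo { int a; char b[2]; };

/* Hypothetical helper: build a struct foo value with a constructor.  */
struct foo
make_foo (int x, int y)
{
  return (struct foo) { x + y, { 'a', 0 } };
}

/* A constructor can also be the right-hand side of an assignment.  */
void
reset_foo (struct foo *p)
{
  assert (p != 0);
  *p = (struct foo) { 0, { 0, 0 } };
}
```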
|
|
|
+
|
|
|
+@node Unnamed Types as Fields
|
|
|
+@section Unnamed Types as Fields
|
|
|
+@cindex unnamed structures
|
|
|
+@cindex unnamed unions
|
|
|
+@cindex structures, unnamed
|
|
|
+@cindex unions, unnamed
|
|
|
+
|
|
|
+A structure or a union can contain, as fields,
|
|
|
+unnamed structures and unions. Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+struct
|
|
|
+@{
|
|
|
+ int a;
|
|
|
+ union
|
|
|
+ @{
|
|
|
+ int b;
|
|
|
+ float c;
|
|
|
+ @};
|
|
|
+ int d;
|
|
|
+@} foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+You can access the fields of the unnamed union within @code{foo} as if they
|
|
|
+were individual fields at the same level as the union definition:
|
|
|
+
|
|
|
+@example
|
|
|
+foo.a = 42;
|
|
|
+foo.b = 47;
|
|
|
+foo.c = 5.25; // @r{Overwrites the value in @code{foo.b}}.
|
|
|
+foo.d = 314;
|
|
|
+@end example
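
The same idea works with a tagged structure type. In this sketch the tag @code{record} and the function @code{make_record} are hypothetical; the fields match the example above.

```c
#include <assert.h>

struct record
{
  int a;
  union
  {
    int b;
    float c;
  };
  int d;
};

/* Fill in a record, using the unnamed union's fields directly.  */
struct record
make_record (void)
{
  struct record r;
  r.a = 42;
  r.b = 47;
  r.d = 314;
  /* b and c share storage, so they have the same address.  */
  assert ((void *) &r.b == (void *) &r.c);
  return r;
}
```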
|
|
|
+
|
|
|
+Avoid using field names that could cause ambiguity. For example, with
|
|
|
+this definition:
|
|
|
+
|
|
|
+@example
|
|
|
+struct
|
|
|
+@{
|
|
|
+ int a;
|
|
|
+ struct
|
|
|
+ @{
|
|
|
+ int a;
|
|
|
+ float b;
|
|
|
+ @};
|
|
|
+@} foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+it is impossible to tell what @code{foo.a} refers to. GNU C reports
|
|
|
+an error when a definition is ambiguous in this way.
|
|
|
+
|
|
|
+@node Incomplete Types
|
|
|
+@section Incomplete Types
|
|
|
+@cindex incomplete types
|
|
|
+@cindex types, incomplete
|
|
|
+
|
|
|
+A type that has not been fully defined is called an @dfn{incomplete
|
|
|
+type}. Structure and union types are incomplete when the code makes a
|
|
|
+forward reference, such as @code{struct foo}, before defining the
|
|
|
+type. An array type is incomplete when its length is unspecified.
|
|
|
+
|
|
|
+You can't use an incomplete type to declare a variable or field, or
|
|
|
+use it for a function parameter or return type. The operators
|
|
|
+@code{sizeof} and @code{_Alignof} give errors when used on an
|
|
|
+incomplete type.
|
|
|
+
|
|
|
+However, you can define a pointer to an incomplete type, and declare a
|
|
|
+variable or field with such a pointer type. In general, you can do
|
|
|
+everything with such pointers except dereference them. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+extern void bar (struct mysterious_value *);
|
|
|
+
|
|
|
+void
|
|
|
+foo (struct mysterious_value *arg)
|
|
|
+@{
|
|
|
+ bar (arg);
|
|
|
+@}
|
|
|
+
|
|
|
+@r{@dots{}}
|
|
|
+
|
|
|
+@{
|
|
|
+ struct mysterious_value *p, **q;
|
|
|
+
|
|
|
+ p = *q;
|
|
|
+ foo (p);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+These examples are valid because the code doesn't try to understand
|
|
|
+what @code{p} points to; it just passes the pointer around.
|
|
|
+(Presumably @code{bar} is defined in some other file that really does
|
|
|
+have a definition for @code{struct mysterious_value}.) However,
|
|
|
+dereferencing the pointer would get an error; that requires a
|
|
|
+definition for the structure type.
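
The following sketch puts both situations in one file. The field @code{secret} and the function names are hypothetical. Before the structure is defined, the code can only pass the pointer along; after the definition, it can dereference the pointer.

```c
#include <assert.h>
#include <stddef.h>

struct mysterious_value;              /* Incomplete at this point.  */

/* Passing the pointer along requires no definition.  */
struct mysterious_value *
pass_along (struct mysterious_value *arg)
{
  return arg;
}

/* Once the type is completed, the pointer can be dereferenced.  */
struct mysterious_value { int secret; };

int
reveal (struct mysterious_value *arg)
{
  assert (arg != NULL);
  return arg->secret;
}
```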
|
|
|
+
|
|
|
+@node Intertwined Incomplete Types
|
|
|
+@section Intertwined Incomplete Types
|
|
|
+
|
|
|
+When several structure types contain pointers to each other, you can
|
|
|
+define the types in any order because pointers to types that come
|
|
|
+later are incomplete types.  Here is an example.
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{An employee record points to a group.} */
|
|
|
+struct employee
|
|
|
+@{
|
|
|
+ char *name;
|
|
|
+ @r{@dots{}}
|
|
|
+ struct group *group; /* @r{incomplete type.} */
|
|
|
+ @r{@dots{}}
|
|
|
+@};
|
|
|
+
|
|
|
+/* @r{An employee list points to employees.} */
|
|
|
+struct employee_list
|
|
|
+@{
|
|
|
+ struct employee *this_one;
|
|
|
+ struct employee_list *next; /* @r{incomplete type.} */
|
|
|
+ @r{@dots{}}
|
|
|
+@};
|
|
|
+
|
|
|
+/* @r{A group points to one employee_list.} */
|
|
|
+struct group
|
|
|
+@{
|
|
|
+ char *name;
|
|
|
+ @r{@dots{}}
|
|
|
+ struct employee_list *employees;
|
|
|
+ @r{@dots{}}
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Type Tags
|
|
|
+@section Type Tags
|
|
|
+@cindex type tags
|
|
|
+
|
|
|
+The name that follows @code{struct} (@pxref{Structures}), @code{union}
|
|
|
+(@pxref{Unions}), or @code{enum} (@pxref{Enumeration Types}) is called
|
|
|
+a @dfn{type tag}. In C, a type tag never conflicts with a variable
|
|
|
+name or function name; the type tags have a separate @dfn{name space}.
|
|
|
+Thus, there is no name conflict in this code:
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @};
|
|
|
+int pair = 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+nor in this one:
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @} pair;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+where @code{pair} is both a structure type tag and a variable name.
|
|
|
+
|
|
|
+However, @code{struct}, @code{union}, and @code{enum} share the same
|
|
|
+name space of tags, so this is a conflict:
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @};
|
|
|
+enum pair @{ c, d @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and so is this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @};
|
|
|
+struct pair @{ int c, d; @};
|
|
|
+@end example
|
|
|
+
|
|
|
+When the code defines a type tag inside a block, the tag's scope is
|
|
|
+limited to that block (as for local variables). Two definitions for
|
|
|
+one type tag do not conflict if they are in different scopes; rather,
|
|
|
+each is valid in its scope. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @};
|
|
|
+
|
|
|
+void
|
|
|
+pair_up_doubles (int len, double array[])
|
|
|
+@{
|
|
|
+ struct pair @{ double a, b; @};
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+has two definitions for @code{struct pair} which do not conflict. The
|
|
|
+one inside the function applies only within the definition of
|
|
|
+@code{pair_up_doubles}. Within its scope, that definition
|
|
|
+@dfn{shadows} the outer definition.
|
|
|
+
|
|
|
+If @code{struct pair} appears inside the function body, before the
|
|
|
+inner definition, it refers to the outer definition---the only one
|
|
|
+that has been seen at that point. Thus, in this code,
|
|
|
+
|
|
|
+@example
|
|
|
+struct pair @{ int a, b; @};
|
|
|
+
|
|
|
+void
|
|
|
+pair_up_doubles (int len, double array[])
|
|
|
+@{
|
|
|
+ struct two_pairs @{ struct pair *p, *q; @};
|
|
|
+ struct pair @{ double a, b; @};
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+the structure @code{two_pairs} has pointers to the outer definition of
|
|
|
+@code{struct pair}, which is probably not desirable.
|
|
|
+
|
|
|
+To prevent that, you can write @code{struct pair;} inside the function
|
|
|
+body as a variable declaration with no variables. This is a
|
|
|
+@dfn{forward declaration} of the type tag @code{pair}: it makes the
|
|
|
+type tag local to the current block, with the details of the type to
|
|
|
+come later. Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+pair_up_doubles (int len, double array[])
|
|
|
+@{
|
|
|
+ /* @r{Forward declaration for @code{pair}.} */
|
|
|
+ struct pair;
|
|
|
+ struct two_pairs @{ struct pair *p, *q; @};
|
|
|
+ /* @r{Give the details.} */
|
|
|
+ struct pair @{ double a, b; @};
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+However, the cleanest practice is to avoid shadowing type tags.
|
|
|
+
|
|
|
+@node Arrays
|
|
|
+@chapter Arrays
|
|
|
+@cindex array
|
|
|
+@cindex elements of arrays
|
|
|
+
|
|
|
+An @dfn{array} is a data object that holds a series of @dfn{elements},
|
|
|
+all of the same data type. Each element is identified by its numeric
|
|
|
+@var{index} within the array.
|
|
|
+
|
|
|
+We presented arrays of numbers in the sample programs early in this
|
|
|
+manual (@pxref{Array Example}). However, arrays can have elements of
|
|
|
+any data type, including pointers, structures, unions, and other
|
|
|
+arrays.
|
|
|
+
|
|
|
+If you know another programming language, you may suppose that you know all
|
|
|
+about arrays, but C arrays have special quirks, so in this chapter we
|
|
|
+collect all the information about arrays in C@.
|
|
|
+
|
|
|
+The elements of a C array are allocated consecutively in memory,
|
|
|
+with no gaps between them. Each element is aligned as required
|
|
|
+for its data type (@pxref{Type Alignment}).
|
|
|
+
|
|
|
+@menu
|
|
|
+* Accessing Array Elements:: How to access individual elements of an array.
|
|
|
+* Declaring an Array:: How to name and reserve space for a new array.
|
|
|
+* Strings:: A string in C is a special case of array.
|
|
|
+* Array Type Designators:: Referring to a specific array type.
|
|
|
+* Incomplete Array Types:: Naming, but not allocating, a new array.
|
|
|
+* Limitations of C Arrays:: Arrays are not first-class objects.
|
|
|
+* Multidimensional Arrays:: Arrays of arrays.
|
|
|
+* Constructing Array Values:: Assigning values to an entire array at once.
|
|
|
+* Arrays of Variable Length:: Declaring arrays of non-constant size.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Accessing Array Elements
|
|
|
+@section Accessing Array Elements
|
|
|
+@cindex accessing array elements
|
|
|
+@cindex array elements, accessing
|
|
|
+
|
|
|
+If the variable @code{a} is an array, the @var{n}th element of
|
|
|
+@code{a} is @code{a[@var{n}]}. You can use that expression to access
|
|
|
+an element's value or to assign to it:
|
|
|
+
|
|
|
+@example
|
|
|
+x = a[5];
|
|
|
+a[6] = 1;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Since the variable @code{a} is an lvalue, @code{a[@var{n}]} is also an
|
|
|
+lvalue.
|
|
|
+
|
|
|
+The lowest valid index in an array is 0, @emph{not} 1, and the highest
|
|
|
+valid index is one less than the number of elements.
|
|
|
+
|
|
|
+The C language does not check whether array indices are in bounds, so
|
|
|
+if the code uses an out-of-range index, it will access memory outside the
|
|
|
+array.
|
|
|
+
|
|
|
+@strong{Warning:} Using only valid index values in C is the
|
|
|
+programmer's responsibility.
|
|
|
+
|
|
|
+Array indexing in C is not a primitive operation: it is defined in
|
|
|
+terms of pointer arithmetic and dereferencing. Now that we know
|
|
|
+@emph{what} @code{a[i]} does, we can ask @emph{how} @code{a[i]} does
|
|
|
+its job.
|
|
|
+
|
|
|
+In C, @code{@var{x}[@var{y}]} is an abbreviation for
|
|
|
+@code{*(@var{x}+@var{y})}. Thus, @code{a[i]} really means
|
|
|
+@code{*(a+i)}. @xref{Pointers and Arrays}.
|
|
|
+
|
|
|
+When an expression with array type (such as @code{a}) appears as part
|
|
|
+of a larger C expression, it is converted automatically to a pointer
|
|
|
+to element zero of that array. For instance, @code{a} in an
|
|
|
+expression is equivalent to @code{&a[0]}. Thus, @code{*(a+i)} is
|
|
|
+computed as @code{*(&a[0]+i)}.
|
|
|
+
|
|
|
+Now we can analyze how that expression gives us the desired element of
|
|
|
+the array. It makes a pointer to element 0 of @code{a}, advances it
|
|
|
+by the value of @code{i}, and dereferences that pointer.
|
|
|
+
|
|
|
+Another equivalent way to write the expression is @code{(&a[0])[i]}.
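
To see that these spellings really name the same element, here is a small sketch. The function name @code{index_forms_agree} and the array contents are made up for the illustration.

```c
#include <assert.h>

/* Return 1 if all the equivalent spellings of a[i] agree.  */
int
index_forms_agree (int i)
{
  int a[4] = { 10, 20, 30, 40 };
  /* Using only valid indices is the caller's responsibility.  */
  assert (i >= 0 && i < 4);
  return (a[i] == *(a + i)
          && a[i] == *(&a[0] + i)
          && a[i] == (&a[0])[i]);
}
```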
|
|
|
+
|
|
|
+@node Declaring an Array
|
|
|
+@section Declaring an Array
|
|
|
+@cindex declaring an array
|
|
|
+@cindex array, declaring
|
|
|
+
|
|
|
+To make an array declaration, write @code{[@var{length}]} after the
|
|
|
+name being declared. This construct is valid in the declaration of a
|
|
|
+variable, a function parameter, a function value type (the value can't
|
|
|
+be an array, but it can be a pointer to one), a structure field, or a
|
|
|
+union alternative.
|
|
|
+
|
|
|
+The surrounding declaration specifies the element type of the array;
|
|
|
+that can be any type of data, but not @code{void} or a function type.
|
|
|
+For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+double a[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{a} as an array of 5 @code{double}s.
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo bstruct[length];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{bstruct} as an array of @code{length} objects of type
|
|
|
+@code{struct foo}. A variable array size like this is allowed when
|
|
|
+the array is not file-scope.
|
|
|
+
|
|
|
+Other declaration constructs can nest within the array declaration
|
|
|
+construct. For instance:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo *b[length];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{b} as an array of @code{length} pointers to
|
|
|
+@code{struct foo}. This shows that the length need not be a constant
|
|
|
+(@pxref{Arrays of Variable Length}).
|
|
|
+
|
|
|
+@example
|
|
|
+double (*c)[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{c} as a pointer to an array of 5 @code{double}s, and
|
|
|
+
|
|
|
+@example
|
|
|
+char *(*f (int))[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{f} as a function taking an @code{int} argument and
|
|
|
+returning a pointer to an array of 5 strings (pointers to
|
|
|
+@code{char}s).
|
|
|
+
|
|
|
+@example
|
|
|
+double aa[5][10];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{aa} as an array of 5 elements, each of which is an
|
|
|
+array of 10 @code{double}s. This shows how to declare a
|
|
|
+multidimensional array in C (@pxref{Multidimensional Arrays}).
|
|
|
+
|
|
|
+All these declarations specify the array's length, which is needed in
|
|
|
+these cases in order to allocate storage for the array.
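
Because the length is part of a declared array's type, @code{sizeof} can recover it. The sketch below uses hypothetical function names and divides the size of the whole array by the size of one element.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in an array of 5 doubles.  */
size_t
a_length (void)
{
  double a[5];
  return sizeof a / sizeof a[0];
}

/* Number of rows in a 5-by-10 array: whole size over the size of one row.  */
size_t
aa_rows (void)
{
  double aa[5][10];
  assert (sizeof aa == 50 * sizeof (double));
  return sizeof aa / sizeof aa[0];
}
```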
|
|
|
+
|
|
|
+@node Strings
|
|
|
+@section Strings
|
|
|
+@cindex string
|
|
|
+
|
|
|
+A string in C is a sequence of elements of type @code{char},
|
|
|
+terminated with the null character, the character with code zero.
|
|
|
+
|
|
|
+Programs often need to use strings with specific, fixed contents. To
|
|
|
+write one in a C program, use a @dfn{string constant} such as
|
|
|
+@code{"Take me to your leader!"}. The data type of a string constant
|
|
|
+is an array of @code{char}, which converts to @code{char *} in most contexts.  For the full syntactic details of writing string
|
|
|
+constants, see @ref{String Constants}.
|
|
|
+
|
|
|
+To declare a place to store a non-constant string, declare an array of
|
|
|
+@code{char}. Keep in mind that it must include one extra @code{char}
|
|
|
+for the terminating null. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+char text[] = @{ 'H', 'e', 'l', 'l', 'o', 0 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares an array named @samp{text} with six elements---five letters
|
|
|
+and the terminating null character. An equivalent way to get the same
|
|
|
+result is this,
|
|
|
+
|
|
|
+@example
|
|
|
+char text[] = "Hello";
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which copies the elements of the string constant, including @emph{its}
|
|
|
+terminating null character.
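
The difference between a string's length and the size of the array that holds it can be checked directly. In this sketch the function name @code{hello_array_size} is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Return the number of elements in an array initialized from "Hello".  */
size_t
hello_array_size (void)
{
  char text[] = "Hello";
  /* strlen counts the characters before the terminating null.  */
  assert (strlen (text) == 5);
  /* sizeof counts the terminating null as well.  */
  return sizeof (text);
}
```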
|
|
|
+
|
|
|
+@example
|
|
|
+char message[200];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares an array long enough to hold a string of 199 ASCII characters
|
|
|
+plus the terminating null character.
|
|
|
+
|
|
|
+When you store a string into @code{message} be sure to check or prove
|
|
|
+that the length does not exceed its size. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+set_message (char *text)
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+ for (i = 0; i < sizeof (message); i++)
|
|
|
+ @{
|
|
|
+ message[i] = text[i];
|
|
|
+ if (text[i] == 0)
|
|
|
+ return;
|
|
|
+ @}
|
|
|
+  fatal_error ("Message is too long for `message'");
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+It's easy to do this with the standard library function
|
|
|
+@code{strncpy}, which fills out the whole destination array (up to a
|
|
|
+specified length) with null characters. Thus, if the last character
|
|
|
+of the destination is not null, the string did not fit. Many system
|
|
|
+libraries, including the GNU C library, hand-optimize @code{strncpy}
|
|
|
+to run faster than an explicit @code{for}-loop.
|
|
|
+
|
|
|
+Here's what the code looks like:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+set_message (char *text)
|
|
|
+@{
|
|
|
+ strncpy (message, text, sizeof (message));
|
|
|
+ if (message[sizeof (message) - 1] != 0)
|
|
|
+    fatal_error ("Message is too long for `message'");
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@xref{String and Array Utilities, The GNU C Library, , libc, The GNU C
|
|
|
+Library Reference Manual}, for more information about the standard
|
|
|
+library functions for operating on strings.
|
|
|
+
|
|
|
+You can avoid putting a fixed length limit on strings you construct or
|
|
|
+operate on by allocating the space for them dynamically.
|
|
|
+@xref{Dynamic Memory Allocation}.
|
|
|
+
|
|
|
+@node Array Type Designators
|
|
|
+@section Array Type Designators
|
|
|
+
|
|
|
+Every C type has a type designator, which you make by deleting the
|
|
|
+variable name and the semicolon from a declaration (@pxref{Type
|
|
|
+Designators}). The designators for array types follow this rule, but
|
|
|
+they may appear surprising.
|
|
|
+
|
|
|
+@example
|
|
|
+@r{type} int a[5]; @r{designator} int [5]
|
|
|
+@r{type} double a[5][3]; @r{designator} double [5][3]
|
|
|
+@r{type} struct foo *a[5]; @r{designator} struct foo *[5]
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Incomplete Array Types
|
|
|
+@section Incomplete Array Types
|
|
|
+@cindex incomplete array types
|
|
|
+@cindex array types, incomplete
|
|
|
+
|
|
|
+An array is equivalent, for most purposes, to a pointer to its zeroth
|
|
|
+element. When that is true, the length of the array is irrelevant.
|
|
|
+The length needs to be known only for allocating space for the array, or
|
|
|
+for @code{sizeof} and @code{typeof} (@pxref{Auto Type}). Thus, in some
|
|
|
+contexts C allows you to omit the length:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+An @code{extern} declaration says how to refer to a variable allocated
|
|
|
+elsewhere. It does not need to allocate space for the variable,
|
|
|
+so if it is an array, you can omit the length. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+extern int foo[];
|
|
|
+@end example
|
|
|
+
|
|
|
+@item
|
|
|
+When declaring a function parameter as an array, the argument value
|
|
|
+passed to the function is really a pointer to the array's zeroth
|
|
|
+element.  This value does not say how long the array really is, so there
|
|
|
+is no need to declare it. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+func (int foo[])
|
|
|
+@end example
|
|
|
+@end itemize
|
|
|
+
|
|
|
+These declarations are examples of @dfn{incomplete} array types, types
|
|
|
+that are not fully specified. The incompleteness makes no difference
|
|
|
+for accessing elements of the array, but it matters for some other
|
|
|
+things. For instance, @code{sizeof} is not allowed on an incomplete
|
|
|
+type.
|
|
|
+
|
|
|
+With multidimensional arrays, only the first dimension can be omitted:
|
|
|
+
|
|
|
+@example
|
|
|
+extern struct chesspiece *funnyboard[][8];
|
|
|
+@end example
|
|
|
+
|
|
|
+In other words, the code doesn't have to say how many rows there are,
|
|
|
+but it must state how big each row is.
|
|
|
+
|
|
|
+@node Limitations of C Arrays
|
|
|
+@section Limitations of C Arrays
|
|
|
+@cindex limitations of C arrays
|
|
|
+@cindex first-class object
|
|
|
+
|
|
|
+Arrays have quirks in C because they are not ``first-class objects'':
|
|
|
+there is no way in C to operate on an array as a unit.
|
|
|
+
|
|
|
+The other composite objects in C, structures and unions, are
|
|
|
+first-class objects: a C program can copy a structure or union value
|
|
|
+in an assignment, or pass one as an argument to a function, or make a
|
|
|
+function return one. You can't do those things with an array in C@.
|
|
|
+That is because a value you can operate on never has an array type.
|
|
|
+
|
|
|
+An expression in C can have an array type, but that doesn't produce
|
|
|
+the array as a value. Instead it is converted automatically to a
|
|
|
+pointer to the array's element at index zero. The code can operate
|
|
|
+on the pointer, and through that on individual elements of the array,
|
|
|
+but it can't get and operate on the array as a unit.
|
|
|
+
|
|
|
+There are three exceptions to this conversion rule, but none of them
|
|
|
+offers a way to operate on the array as a whole.
|
|
|
+
|
|
|
+First, @samp{&} applied to an expression with array type gives you the
|
|
|
+address of the array, as a pointer-to-array type.  However, you can't operate on the
|
|
|
+whole array that way---if you apply @samp{*} to get the array back,
|
|
|
+that expression converts, as usual, to a pointer to its zeroth
|
|
|
+element.
|
|
|
+
|
|
|
+Second, the operators @code{sizeof}, @code{_Alignof}, and
|
|
|
+@code{typeof} do not convert the array to a pointer; they leave it as
|
|
|
+an array. But they don't operate on the array's data---they only give
|
|
|
+information about its type.
|
|
|
+
|
|
|
+Third, a string constant used as an initializer for an array is not
|
|
|
+converted to a pointer---rather, the declaration copies the
|
|
|
+@emph{contents} of that string in that one special case.
|
|
|
+
|
|
|
+You @emph{can} copy the contents of an array, just not with an
|
|
|
+assignment operator. You can do it by calling the library function
|
|
|
+@code{memcpy} or @code{memmove} (@pxref{Copying and Concatenation, The
|
|
|
+GNU C Library, , libc, The GNU C Library Reference Manual}). Also,
|
|
|
+when a structure contains just an array, you can copy that structure.
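
Both ways of copying can be sketched as follows. The type @code{struct int4} and the function names are hypothetical, and the array length of 4 is chosen only for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper structure holding just an array.  */
struct int4 { int v[4]; };

/* Copy an array with memcpy; the two arrays must not overlap.  */
void
copy_ints (int dst[4], const int src[4])
{
  assert (dst != 0 && src != 0);
  memcpy (dst, src, 4 * sizeof (int));
}

/* Copy an array by assigning the structure that contains it.  */
struct int4
copy_wrapped (struct int4 src)
{
  struct int4 dst;
  dst = src;
  return dst;
}
```

Note that the parameters of @code{copy_ints} are really pointers, as described above, so the length must be stated explicitly in the call to @code{memcpy}.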
|
|
|
+
|
|
|
+An array itself is an lvalue if it is a declared variable, or part of
|
|
|
+a structure or union that is an lvalue. When you construct an array
|
|
|
+from elements (@pxref{Constructing Array Values}), that array is not
|
|
|
+an lvalue.
|
|
|
+
|
|
|
+@node Multidimensional Arrays
|
|
|
+@section Multidimensional Arrays
|
|
|
+@cindex multidimensional arrays
|
|
|
+@cindex array, multidimensional
|
|
|
+
|
|
|
+Strictly speaking, all arrays in C are unidimensional. However, you
|
|
|
+can create an array of arrays, which is more or less equivalent to a
|
|
|
+multidimensional array. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+struct chesspiece *board[8][8];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares an array of 8 arrays of 8 pointers to @code{struct
|
|
|
+chesspiece}. This data type could represent the state of a chess
|
|
|
+game. To access one square's contents requires two array index
|
|
|
+operations, one for each dimension. For instance, you can write
|
|
|
+@code{board[row][column]}, assuming @code{row} and @code{column}
|
|
|
+are variables with integer values in the proper range.
|
|
|
+
|
|
|
+How does C understand @code{board[row][column]}? First of all,
|
|
|
+@code{board} is converted automatically to a pointer to the zeroth
|
|
|
+element (at index zero) of @code{board}. Adding @code{row} to that
|
|
|
+makes it point to the desired element. Thus, @code{board[row]}'s
|
|
|
+value is an element of @code{board}---an array of 8 pointers.
|
|
|
+
|
|
|
+However, as an expression with array type, it is converted
|
|
|
+automatically to a pointer to the array's zeroth element. The second
|
|
|
+array index operation, @code{[column]}, accesses the chosen element
|
|
|
+from that array.
|
|
|
+
|
|
|
+As this shows, pointer-to-array types are meaningful in C@.
|
|
|
+You can declare a variable that points to a row in a chess board
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct chesspiece *(*rowptr)[8];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This points to an array of 8 pointers to @code{struct chesspiece}.
|
|
|
+You can assign to it as follows:
|
|
|
+
|
|
|
+@example
|
|
|
+rowptr = &board[5];
|
|
|
+@end example
|
|
|
+
|
|
|
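To illustrate how a pointer to a row can be used, here is a small sketch that works on rows of @code{int} rather than rows of chess-piece pointers. The names @code{NCOLS}, @code{sum_row} and @code{sum_row_of} are assumptions made for this example.

```c
#define NCOLS 8   /* Hypothetical row length for this sketch.  */

/* Sum the NCOLS integers in the row that ROWPTR points to.
   Dereferencing ROWPTR gives the row, an array of NCOLS ints.  */
int
sum_row (int (*rowptr)[NCOLS])
{
  int total = 0;
  for (int col = 0; col < NCOLS; ++col)
    total += (*rowptr)[col];
  return total;
}

/* Take the address of one row of TABLE and pass it along.  */
int
sum_row_of (int table[][NCOLS], int row)
{
  return sum_row (&table[row]);
}
```

The parentheses in @code{(*rowptr)[col]} matter: without them, the subscript would apply to @code{rowptr} first and select a different row.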
+The dimensions don't have to be equal in length. Here we declare
|
|
|
+@code{statepop} as an array to hold the population of each state in
|
|
|
+the United States for each year since 1900:
|
|
|
+
|
|
|
+@example
|
|
|
+#define NSTATES 50
|
|
|
+@{
|
|
|
+ int nyears = current_year - 1900 + 1;
|
|
|
+ int statepop[NSTATES][nyears];
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The variable @code{statepop} is an array of @code{NSTATES} subarrays,
|
|
|
+each indexed by the year (counting from 1900). Thus, to get the
|
|
|
+element for a particular state and year, we must subscript it first
|
|
|
+by the number that indicates the state, and second by the index for
|
|
|
+the year:
|
|
|
+
|
|
|
+@example
|
|
|
+statepop[state][year - 1900]
|
|
|
+@end example
|
|
|
+
|
|
|
+@cindex array, layout in memory
|
|
|
+The subarrays within the multidimensional array are allocated
|
|
|
+consecutively in memory, and within each subarray, its elements are
|
|
|
+allocated consecutively in memory. The most efficient way to process
|
|
|
+all the elements in the array is to scan the last subscript in the
|
|
|
+innermost loop. This means consecutive accesses go to consecutive
|
|
|
+memory locations, which optimizes use of the processor's memory cache.
|
|
|
+For example:
|
|
|
+
|
|
|
+@example
|
|
|
+int total = 0;
|
|
|
+float average;
|
|
|
+
|
|
|
+for (int state = 0; state < NSTATES; ++state)
|
|
|
+ @{
|
|
|
+ for (int year = 0; year < nyears; ++year)
|
|
|
+ @{
|
|
|
+ total += statepop[state][year];
|
|
|
+ @}
|
|
|
+ @}
|
|
|
+
|
|
|
+average = (float) total / nyears;
|
|
|
+@end example
|
|
|
+
|
|
|
+C's layout for multidimensional arrays is different from Fortran's
|
|
|
+layout. In Fortran, a multidimensional array is not an array of
|
|
|
+arrays; rather, multidimensional arrays are a primitive feature, and
|
|
|
+it is the first index that varies most rapidly between consecutive
|
|
|
+memory locations. Thus, the memory layout of a 50x114 array in C
|
|
|
+matches that of a 114x50 array in Fortran.
|
|
|
+
|
|
|
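The layout rule can be checked with a short sketch. It compares the offset predicted by the row-major formula with the offset measured from actual addresses. The names @code{NROWS}, @code{NCOLS}, @code{row_major_offset} and @code{measured_offset} are hypothetical.

```c
#include <stddef.h>

#define NROWS 3   /* Hypothetical dimensions for this sketch.  */
#define NCOLS 4

/* Offset, counted in elements, of grid[row][col] from grid[0][0]
   as predicted by C's layout: the last subscript varies fastest.  */
size_t
row_major_offset (size_t row, size_t col)
{
  return row * NCOLS + col;
}

/* The same offset, measured from the actual addresses in bytes
   and then converted to a count of elements.  */
size_t
measured_offset (size_t row, size_t col)
{
  static int grid[NROWS][NCOLS];
  return (size_t) ((char *) &grid[row][col] - (char *) grid)
         / sizeof (int);
}
```

Under these assumptions the two functions agree for every valid pair of subscripts, which is why scanning the last subscript in the innermost loop visits consecutive memory locations.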
+@node Constructing Array Values
|
|
|
+@section Constructing Array Values
|
|
|
+@cindex constructing array values
|
|
|
+@cindex array values, constructing
|
|
|
+
|
|
|
+You can construct an array from elements by writing them inside
|
|
|
+braces, and preceding all that with the array type's designator in
|
|
|
+parentheses. There is no need to specify the array length, since the
|
|
|
+number of elements determines that. The constructor looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+(@var{elttype}[]) @{ @var{elements} @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Here is an example, which constructs an array of string pointers:
|
|
|
+
|
|
|
+@example
|
|
|
+(char *[]) @{ "x", "y", "z" @};
|
|
|
+@end example
|
|
|
+
|
|
|
+That's equivalent in effect to declaring an array with the same
|
|
|
+initializer, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+char *array[] = @{ "x", "y", "z" @};
|
|
|
+@end example
|
|
|
+
|
|
|
+and then using the array.
|
|
|
+
|
|
|
+If all the elements are simple constant expressions, or made up of
|
|
|
+such, then the compound literal can be coerced to a pointer to its
|
|
|
+zeroth element and used to initialize a file-scope variable
|
|
|
+(@pxref{File-Scope Variables}), as shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+char **foo = (char *[]) @{ "x", "y", "z" @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The data type of @code{foo} is @code{char **}, which is a pointer
|
|
|
+type, not an array type. The declaration is equivalent to defining
|
|
|
+and then using an array-type variable:
|
|
|
+
|
|
|
+@example
|
|
|
+char *nameless_array[] = @{ "x", "y", "z" @};
|
|
|
+char **foo = &nameless_array[0];
|
|
|
+@end example
|
|
|
+
|
|
|
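A constructed array is often useful as a function argument, since it converts to a pointer to its zeroth element like any other array. The following sketch assumes two hypothetical functions, @code{sum_ints} and @code{sum_of_three}, to show this.

```c
#include <stddef.h>

/* Sum N integers starting at VALUES.  */
int
sum_ints (const int *values, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; ++i)
    total += values[i];
  return total;
}

/* Construct a three-element array from the arguments and pass it
   along.  Inside a function, the elements need not be constants.  */
int
sum_of_three (int a, int b, int c)
{
  return sum_ints ((int[]) { a, b, c }, 3);
}
```

The constructed array here exists only until the enclosing block exits, so a pointer to it should not be kept beyond that point.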
+@node Arrays of Variable Length
|
|
|
+@section Arrays of Variable Length
|
|
|
+@cindex array of variable length
|
|
|
+@cindex variable-length arrays
|
|
|
+
|
|
|
+In GNU C, you can declare variable-length arrays like any other
|
|
|
+arrays, but with a length that is not a constant expression. The
|
|
|
+storage is allocated at the point of declaration and deallocated when
|
|
|
+the block scope containing the declaration exits. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Defines @code{FILE}.} */
|
|
|
+#include <string.h> /* @r{Declares @code{strlen}, @code{strcpy}, @code{strcat}.} */
|
|
|
+
|
|
|
+FILE *
|
|
|
+concat_fopen (char *s1, char *s2, char *mode)
|
|
|
+@{
|
|
|
+ char str[strlen (s1) + strlen (s2) + 1];
|
|
|
+ strcpy (str, s1);
|
|
|
+ strcat (str, s2);
|
|
|
+ return fopen (str, mode);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+(This uses some standard library functions; see @ref{String and Array
|
|
|
+Utilities, , , libc, The GNU C Library Reference Manual}.)
|
|
|
+
|
|
|
+The length of an array is computed once when the storage is allocated
|
|
|
+and is remembered for the scope of the array in case it is used in
|
|
|
+@code{sizeof}.
|
|
|
+
|
|
|
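As a small sketch of this behavior, the hypothetical function @code{vla_bytes} below declares a variable-length array and reports its size. It assumes @code{n} is greater than zero and small enough that the array fits on the stack.

```c
#include <stddef.h>

/* Return the size in bytes of a variable-length array of N doubles.
   For such an array, sizeof is computed when the program runs,
   from the length that was remembered at the declaration.  */
size_t
vla_bytes (size_t n)
{
  double buffer[n];
  return sizeof buffer;
}
```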
+@strong{Warning:} don't allocate a variable-length array if the size
|
|
|
+might be very large (more than 100,000), or in a recursive function,
|
|
|
+because that is likely to cause stack overflow. Allocate the array
|
|
|
+dynamically instead (@pxref{Dynamic Memory Allocation}).
|
|
|
+
|
|
|
+Jumping or breaking out of the scope of the array name deallocates the
|
|
|
+storage. Jumping into the scope is not allowed; that gives an error
|
|
|
+message.
|
|
|
+
|
|
|
+You can also use variable-length arrays as arguments to functions:
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len, char data[len][len])
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+As usual, a function argument declared with an array type
|
|
|
+is really a pointer to an array that already exists.
|
|
|
+Calling the function does not allocate the array, so there's no
|
|
|
+particular danger of stack overflow in using this construct.
|
|
|
+
|
|
|
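Here is a sketch of a complete function that takes a variable-length array parameter. The function name @code{trace} and the element type @code{int} are choices made for this example, not taken from the declaration above.

```c
/* Return the sum of the main diagonal of the LEN-by-LEN matrix DATA.
   DATA is really a pointer to the first row of an existing array;
   LEN tells the compiler how long each row is.  */
int
trace (int len, int data[len][len])
{
  int total = 0;
  for (int i = 0; i < len; ++i)
    total += data[i][i];
  return total;
}
```

The caller passes the length first, then the array, as in @code{trace (3, matrix)}.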
+To pass the array first and the length afterward, use a forward
|
|
|
+declaration in the function's parameter list (another GNU extension).
|
|
|
+For example,
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len; char data[len][len], int len)
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The @code{int len} before the semicolon is a @dfn{parameter forward
|
|
|
+declaration}, and it serves the purpose of making the name @code{len}
|
|
|
+known when the declaration of @code{data} is parsed.
|
|
|
+
|
|
|
+You can write any number of such parameter forward declarations in the
|
|
|
+parameter list. They can be separated by commas or semicolons, but
|
|
|
+the last one must end with a semicolon, which is followed by the
|
|
|
+``real'' parameter declarations. Each forward declaration must match
|
|
|
+a ``real'' declaration in parameter name and data type. ISO C11 does
|
|
|
+not support parameter forward declarations.
|
|
|
+
|
|
|
+@node Enumeration Types
|
|
|
+@chapter Enumeration Types
|
|
|
+@cindex enumeration types
|
|
|
+@cindex types, enumeration
|
|
|
+@cindex enumerator
|
|
|
+
|
|
|
+An @dfn{enumeration type} represents a limited set of integer values,
|
|
|
+each with a name. It is effectively equivalent to a primitive integer
|
|
|
+type.
|
|
|
+
|
|
|
+Suppose we have a list of possible emotional states to store in an
|
|
|
+integer variable. We can give names to these alternative values with
|
|
|
+an enumeration:
|
|
|
+
|
|
|
+@example
|
|
|
+enum emotion_state @{ neutral, happy, sad, worried,
|
|
|
+ calm, nervous @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+(Never mind that this is a simplistic way to classify emotional states;
|
|
|
+it's just a code example.)
|
|
|
+
|
|
|
+The names inside the enumeration are called @dfn{enumerators}. The
|
|
|
+enumeration type defines them as constants, and their values are
|
|
|
+consecutive integers; @code{neutral} is 0, @code{happy} is 1,
|
|
|
+@code{sad} is 2, and so on. Alternatively, you can specify values for
|
|
|
+the enumerators explicitly like this:
|
|
|
+
|
|
|
+@example
|
|
|
+enum emotion_state @{ neutral = 2, happy = 5,
|
|
|
+ sad = 20, worried = 10,
|
|
|
+ calm = -5, nervous = -300 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Each enumerator which does not specify a value gets value zero
|
|
|
+(if it is at the beginning) or the next consecutive integer.
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{@code{neutral} is 0 by default,}
|
|
|
+ @r{and @code{worried} is 21 by default.} */
|
|
|
+enum emotion_state @{ neutral,
|
|
|
+ happy = 5, sad = 20, worried,
|
|
|
+ calm = -5, nervous = -300 @};
|
|
|
+@end example
|
|
|
+
|
|
|
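Since enumerators are constants, they can serve as @code{case} labels. The sketch below uses a variant of the enumeration in which @code{nervous} has no explicit value, so it gets the integer after @code{calm}; the function @code{emotion_name} is hypothetical.

```c
/* neutral is 0, worried is 21, and nervous is -4 by default.  */
enum emotion_state { neutral,
                     happy = 5, sad = 20, worried,
                     calm = -5, nervous };

/* Return a string naming the emotional state E.  */
const char *
emotion_name (enum emotion_state e)
{
  switch (e)
    {
    case neutral: return "neutral";
    case happy:   return "happy";
    case sad:     return "sad";
    case worried: return "worried";
    case calm:    return "calm";
    case nervous: return "nervous";
    }
  /* E held a value that matches no enumerator.  */
  return "unknown";
}
```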
+If an enumerator is obsolete, you can specify that using it should
|
|
|
+cause a warning, by including an attribute in the enumerator's
|
|
|
+declaration. Here is how @code{happy} would look with this
|
|
|
+attribute:
|
|
|
+
|
|
|
+@example
|
|
|
+happy __attribute__
|
|
|
+ ((deprecated
|
|
|
+ ("impossible under plutocratic rule")))
|
|
|
+ = 5,
|
|
|
+@end example
|
|
|
+
|
|
|
+@xref{Attributes}.
|
|
|
+
|
|
|
+You can declare variables with the enumeration type:
|
|
|
+
|
|
|
+@example
|
|
|
+enum emotion_state feelings_now;
|
|
|
+@end example
|
|
|
+
|
|
|
+In the C code itself, this is equivalent to declaring the variable
|
|
|
+@code{int}. (If none of the enumeration values is negative, it is
|
|
|
+equivalent to @code{unsigned int}.) However, declaring it with the
|
|
|
+enumeration type has an advantage in debugging, because GDB knows it
|
|
|
+should display the current value of the variable using the
|
|
|
+corresponding name. If the variable's type is @code{int}, GDB can
|
|
|
+only show the value as a number.
|
|
|
+
|
|
|
+The identifier that follows @code{enum} is called a @dfn{type tag}
|
|
|
+since it distinguishes different enumeration types. Type tags are in
|
|
|
+a separate name space and belong to scopes like most other names in C@.
|
|
|
+@xref{Type Tags}, for explanation.
|
|
|
+
|
|
|
+You can predeclare an @code{enum} type tag like a structure or union
|
|
|
+type tag, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+enum foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The @code{enum} type is incomplete until you finish defining it.
|
|
|
+
|
|
|
+You can optionally include a trailing comma at the end of a list of
|
|
|
+enumeration values:
|
|
|
+
|
|
|
+@example
|
|
|
+enum emotion_state @{ neutral, happy, sad, worried,
|
|
|
+ calm, nervous, @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This is useful in some macro definitions, since it enables you to
|
|
|
+assemble the list of enumerators without knowing which one is last.
|
|
|
+The extra comma does not change the meaning of the enumeration in any
|
|
|
+way.
|
|
|
+
|
|
|
+@node Defining Typedef Names
|
|
|
+@chapter Defining Typedef Names
|
|
|
+@cindex typedef names
|
|
|
+@findex typedef
|
|
|
+
|
|
|
+You can define a data type keyword as an alias for any type, and then
|
|
|
+use the alias syntactically like a built-in type keyword such as
|
|
|
+@code{int}. You do this using @code{typedef}, so these aliases are
|
|
|
+also called @dfn{typedef names}.
|
|
|
+
|
|
|
+@code{typedef} is followed by text that looks just like a variable
|
|
|
+declaration, but instead of declaring variables it defines data type
|
|
|
+keywords.
|
|
|
+
|
|
|
+Here's how to define @code{fooptr} as a typedef alias for the type
|
|
|
+@code{struct foo *}, then declare @code{x} and @code{y} as variables
|
|
|
+with that type:
|
|
|
+
|
|
|
+@example
|
|
|
+typedef struct foo *fooptr;
|
|
|
+
|
|
|
+fooptr x, y;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+That declaration is equivalent to the following one:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo *x, *y;
|
|
|
+@end example
|
|
|
+
|
|
|
+You can define a typedef alias for any type. For instance, this makes
|
|
|
+@code{frobcount} an alias for type @code{int}:
|
|
|
+
|
|
|
+@example
|
|
|
+typedef int frobcount;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This doesn't define a new type distinct from @code{int}. Rather,
|
|
|
+@code{frobcount} is another name for the type @code{int}. Once the
|
|
|
+variable is declared, it makes no difference which name the
|
|
|
+declaration used.
|
|
|
+
|
|
|
+There is a syntactic difference, however, between @code{frobcount} and
|
|
|
+@code{int}: A typedef name cannot be used with
|
|
|
+@code{signed}, @code{unsigned}, @code{long} or @code{short}. It has
|
|
|
+to specify the type all by itself. So you can't write this:
|
|
|
+
|
|
|
+@example
|
|
|
+unsigned frobcount f1; /* @r{Error!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+But you can write this:
|
|
|
+
|
|
|
+@example
|
|
|
+typedef unsigned int unsigned_frobcount;
|
|
|
+
|
|
|
+unsigned_frobcount f1;
|
|
|
+@end example
|
|
|
+
|
|
|
+In other words, a typedef name is not an alias for @emph{a keyword}
|
|
|
+such as @code{int}. It stands for a @emph{type}, and that could be
|
|
|
+the type @code{int}.
|
|
|
+
|
|
|
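One way to see that a typedef name stands for the same type, and not a new one, is the sketch below. It assumes a compiler that supports the C11 @code{_Generic} construct; the function names are hypothetical.

```c
typedef int frobcount;

/* Because frobcount is another name for int, the address of a
   frobcount variable is an int pointer; no conversion is needed.  */
int
read_through_int_pointer (void)
{
  frobcount count = 42;
  int *p = &count;
  return *p;
}

/* _Generic selects an expression according to the type of its
   first operand.  Here it selects 1, since frobcount is int.  */
int
frobcount_is_int (void)
{
  return _Generic ((frobcount) 0, int: 1, default: 0);
}
```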
+Typedef names are in the same namespace as functions and variables, so
|
|
|
+you can't use the same name for a typedef and a function, or a typedef
|
|
|
+and a variable. When a typedef is declared inside a code block, it is
|
|
|
+in scope only in that block.
|
|
|
+
|
|
|
+@strong{Warning:} Avoid defining typedef names that end in @samp{_t},
|
|
|
+because many of these have standard meanings.
|
|
|
+
|
|
|
+You can redefine a typedef name to the exact same type as its first
|
|
|
+definition, but you cannot redefine a typedef name to a
|
|
|
+different type, even if the two types are compatible. For example, this
|
|
|
+is valid:
|
|
|
+
|
|
|
+@example
|
|
|
+typedef int frobcount;
|
|
|
+typedef int frotzcount;
|
|
|
+typedef frotzcount frobcount;
|
|
|
+typedef frobcount frotzcount;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+because each typedef name is always defined with the same type
|
|
|
+(@code{int}), but this is not valid:
|
|
|
+
|
|
|
+@example
|
|
|
+enum foo @{f1, f2, f3@};
|
|
|
+typedef enum foo frobcount;
|
|
|
+typedef int frobcount;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Even though the type @code{enum foo} is compatible with @code{int},
|
|
|
+they are not the @emph{same} type.
|
|
|
+
|
|
|
+@node Statements
|
|
|
+@chapter Statements
|
|
|
+@cindex statements
|
|
|
+
|
|
|
+A @dfn{statement} specifies computations to be done for effect; it
|
|
|
+does not produce a value, as an expression would. In general a
|
|
|
+statement ends with a semicolon (@samp{;}), but blocks (which are
|
|
|
+statements, more or less) are an exception to that rule.
|
|
|
+@ifnottex
|
|
|
+@xref{Blocks}.
|
|
|
+@end ifnottex
|
|
|
+
|
|
|
+The places to use statements are inside a block, and inside a
|
|
|
+complex statement. A @dfn{complex statement} contains one or two
|
|
|
+components that are nested statements. Each such component must
|
|
|
+consist of one and only one statement. The way to put multiple
|
|
|
+statements in such a component is to group them into a @dfn{block}
|
|
|
+(@pxref{Blocks}), which counts as one statement.
|
|
|
+
|
|
|
+The following sections describe the various kinds of statement.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Expression Statement:: Evaluate an expression, as a statement,
|
|
|
+ usually done for a side effect.
|
|
|
+* if Statement:: Basic conditional execution.
|
|
|
+* if-else Statement:: Multiple branches for conditional execution.
|
|
|
+* Blocks:: Grouping multiple statements together.
|
|
|
+* return Statement:: Return a value from a function.
|
|
|
+* Loop Statements:: Repeatedly executing a statement or block.
|
|
|
+* switch Statement:: Multi-way conditional choices.
|
|
|
+* switch Example:: A plausible example of using @code{switch}.
|
|
|
+* Duffs Device:: A special way to use @code{switch}.
|
|
|
+* Case Ranges:: Ranges of values for @code{switch} cases.
|
|
|
+* Null Statement:: A statement that does nothing.
|
|
|
+* goto Statement:: Jump to another point in the source code,
|
|
|
+ identified by a label.
|
|
|
+* Local Labels:: Labels with limited scope.
|
|
|
+* Labels as Values:: Getting the address of a label.
|
|
|
+* Statement Exprs:: A series of statements used as an expression.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Expression Statement
|
|
|
+@section Expression Statement
|
|
|
+@cindex expression statement
|
|
|
+@cindex statement, expression
|
|
|
+
|
|
|
+The most common kind of statement in C is an @dfn{expression statement}.
|
|
|
+It consists of an expression followed by a
|
|
|
+semicolon. The expression's value is discarded, so the expressions
|
|
|
+that are useful are those that have side effects: assignment
|
|
|
+expressions, increment and decrement expressions, and function calls.
|
|
|
+Here are examples of expression statements:
|
|
|
+
|
|
|
+@smallexample
|
|
|
+x = 5; /* @r{Assignment expression.} */
|
|
|
+p++; /* @r{Increment expression.} */
|
|
|
+printf ("Done\n"); /* @r{Function call expression.} */
|
|
|
+*p; /* @r{Cause @code{SIGSEGV} signal if @code{p} is null.} */
|
|
|
+x + y; /* @r{Useless statement without effect.} */
|
|
|
+@end smallexample
|
|
|
+
|
|
|
+In very unusual circumstances we use an expression statement
|
|
|
+whose purpose is to get a fault if an address is invalid:
|
|
|
+
|
|
|
+@smallexample
|
|
|
+volatile char *p;
|
|
|
+@r{@dots{}}
|
|
|
+*p; /* @r{Cause signal if @code{p} is null.} */
|
|
|
+@end smallexample
|
|
|
+
|
|
|
+If the target of @code{p} is not declared @code{volatile}, the
|
|
|
+compiler might optimize away the memory access, since it knows that
|
|
|
+the value isn't really used. @xref{volatile}.
|
|
|
+
|
|
|
+@node if Statement
|
|
|
+@section @code{if} Statement
|
|
|
+@cindex @code{if} statement
|
|
|
+@cindex statement, @code{if}
|
|
|
+@findex if
|
|
|
+
|
|
|
+An @code{if} statement computes an expression to decide
|
|
|
+whether to execute the following statement or not.
|
|
|
+It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (@var{condition})
|
|
|
+ @var{execute-if-true}
|
|
|
+@end example
|
|
|
+
|
|
|
+The first thing this does is compute the value of @var{condition}. If
|
|
|
+that is true (nonzero), then it executes the statement
|
|
|
+@var{execute-if-true}. If the value of @var{condition} is false
|
|
|
+(zero), it doesn't execute @var{execute-if-true}; instead, it does
|
|
|
+nothing.
|
|
|
+
|
|
|
+This is a @dfn{complex statement} because it contains a component
|
|
|
+@var{execute-if-true} that is a nested statement. It must be one
|
|
|
+and only one statement. The way to put multiple statements there is
|
|
|
+to group them into a @dfn{block} (@pxref{Blocks}).
|
|
|
+
|
|
|
+@node if-else Statement
|
|
|
+@section @code{if-else} Statement
|
|
|
+@cindex @code{if}@dots{}@code{else} statement
|
|
|
+@cindex statement, @code{if}@dots{}@code{else}
|
|
|
+@findex else
|
|
|
+
|
|
|
+An @code{if}-@code{else} statement computes an expression to decide
|
|
|
+which of two nested statements to execute.
|
|
|
+It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (@var{condition})
|
|
|
+ @var{if-true-substatement}
|
|
|
+else
|
|
|
+ @var{if-false-substatement}
|
|
|
+@end example
|
|
|
+
|
|
|
+The first thing this does is compute the value of @var{condition}. If
|
|
|
+that is true (nonzero), then it executes the statement
|
|
|
+@var{if-true-substatement}. If the value of @var{condition} is false
|
|
|
+(zero), then it executes the statement @var{if-false-substatement} instead.
|
|
|
+
|
|
|
+This is a @dfn{complex statement} because it contains components
|
|
|
+@var{if-true-substatement} and @var{if-false-substatement} that are
|
|
|
+nested statements. Each must be one and only one statement. The way
|
|
|
+to put multiple statements in such a component is to group them into a
|
|
|
+@dfn{block} (@pxref{Blocks}).
|
|
|
+
|
|
|
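The two forms of conditional can be sketched as follows. The function names @code{absolute_value} and @code{sign_name} are made up for this example. The second function nests one @code{if}-@code{else} statement as the @var{if-false-substatement} of another, which gives a three-way choice.

```c
/* Plain if: negate X only when it is negative.
   (The result is not meaningful for the most negative int.)  */
int
absolute_value (int x)
{
  if (x < 0)
    x = -x;
  return x;
}

/* if-else: exactly one of the nested statements is executed.  */
const char *
sign_name (int x)
{
  if (x < 0)
    return "negative";
  else if (x == 0)
    return "zero";
  else
    return "positive";
}
```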
+@node Blocks
|
|
|
+@section Blocks
|
|
|
+@cindex block
|
|
|
+@cindex compound statement
|
|
|
+
|
|
|
+A @dfn{block} is a construct that contains multiple statements of any
|
|
|
+kind. It begins with @samp{@{} and ends with @samp{@}}, and has a
|
|
|
+series of statements and declarations in between. Another name for
|
|
|
+blocks is @dfn{compound statements}.
|
|
|
+
|
|
|
+Is a block a statement? Yes and no. It doesn't @emph{look} like a
|
|
|
+normal statement---it does not end with a semicolon. But you can
|
|
|
+@emph{use} it like a statement; anywhere that a statement is required
|
|
|
+or allowed, you can write a block and consider that block a statement.
|
|
|
+
|
|
|
+So far it seems that a block is a kind of statement with an unusual
|
|
|
+syntax. But that is not entirely true: a function body is also a
|
|
|
+block, and that block is definitely not a statement. The text after a
|
|
|
+function header is not treated as a statement; only a function body is
|
|
|
+allowed there, and nothing else would be meaningful there.
|
|
|
+
|
|
|
+In a formal grammar we would have to choose---either a block is a kind
|
|
|
+of statement or it is not. But this manual is meant for humans, not
|
|
|
+for parser generators. The clearest answer for humans is, ``a block
|
|
|
+is a statement, in some ways.''
|
|
|
+
|
|
|
+@cindex nested block
|
|
|
+@cindex internal block
|
|
|
+A block that isn't a function body is called an @dfn{internal block}
|
|
|
+or a @dfn{nested block}. You can put a nested block directly inside
|
|
|
+another block, but more often the nested block is inside some complex
|
|
|
+statement, such as a @code{for} statement or an @code{if} statement.
|
|
|
+
|
|
|
+There are two uses for nested blocks in C:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+To specify the scope for local declarations. For instance, a local
|
|
|
+variable's scope is the rest of the innermost containing block.
|
|
|
+
|
|
|
+@item
|
|
|
+To write a series of statements where, syntactically, one statement is
|
|
|
+called for. For instance, the @var{execute-if-true} of an @code{if}
|
|
|
+statement is one statement. To put multiple statements there, they
|
|
|
+have to be wrapped in a block, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (x < 0)
|
|
|
+ @{
|
|
|
+ printf ("x was negative\n");
|
|
|
+ x = -x;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+@end itemize
|
|
|
+
|
|
|
+This example (repeated from above) shows a nested block which serves
|
|
|
+both purposes: it includes two statements (plus a declaration) in the
|
|
|
+body of a @code{while} statement, and it provides the scope for the
|
|
|
+declaration of @code{q}.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+free_intlist (struct intlistlink *p)
|
|
|
+@{
|
|
|
+ while (p)
|
|
|
+ @{
|
|
|
+ struct intlistlink *q = p;
|
|
|
+ p = p->next;
|
|
|
+ free (q);
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node return Statement
|
|
|
+@section @code{return} Statement
|
|
|
+@cindex @code{return} statement
|
|
|
+@cindex statement, @code{return}
|
|
|
+@findex return
|
|
|
+
|
|
|
+The @code{return} statement makes the containing function return
|
|
|
+immediately. It has two forms. This one specifies no value to
|
|
|
+return:
|
|
|
+
|
|
|
+@example
|
|
|
+return;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+That form is meant for functions whose return type is @code{void}
|
|
|
+(@pxref{The Void Type}). You can also use it in a function that
|
|
|
+returns nonvoid data, but that's a bad idea, since it makes the
|
|
|
+function return garbage.
|
|
|
+
|
|
|
+The form that specifies a value looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+return @var{value};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which computes the expression @var{value} and makes the function
|
|
|
+return that. If necessary, the value undergoes type conversion to
|
|
|
+the function's declared return value type, which works like
|
|
|
+assigning the value to a variable of that type.
|
|
|
+
|
|
|
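Both forms of @code{return} appear in the following sketch, whose function names are hypothetical. The first function relies on the conversion to the declared return type; it assumes the argument is small enough to fit in an @code{int}.

```c
#include <stddef.h>

/* The double value X is converted to int, as if by assignment,
   which discards the fractional part.  */
int
truncate_to_int (double x)
{
  return x;
}

/* In a function whose return type is void, return takes no value.
   Here it leaves the function early when there is nothing to do.  */
void
clear_if_present (int *p)
{
  if (p == NULL)
    return;
  *p = 0;
}
```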
+@node Loop Statements
|
|
|
+@section Loop Statements
|
|
|
+@cindex loop statements
|
|
|
+@cindex statements, loop
|
|
|
+@cindex iteration
|
|
|
+
|
|
|
+You can use a loop statement when you need to execute a series of
|
|
|
+statements repeatedly, making an @dfn{iteration}. C provides several
|
|
|
+different kinds of loop statements, described in the following
|
|
|
+subsections.
|
|
|
+
|
|
|
+Every kind of loop statement is a complex statement because it contains a
|
|
|
+component, here called @var{body}, which is a nested statement.
|
|
|
+Most often the body is a block.
|
|
|
+
|
|
|
+@menu
|
|
|
+* while Statement:: Loop as long as a test expression is true.
|
|
|
+* do-while Statement:: Execute a loop once, with further looping
|
|
|
+ as long as a test expression is true.
|
|
|
+* break Statement:: End a loop immediately.
|
|
|
+* for Statement:: Iterative looping.
|
|
|
+* Example of for:: An example of iterative looping.
|
|
|
+* Omitted for-Expressions:: for-loop expression options.
|
|
|
+* for-Index Declarations:: for-loop declaration options.
|
|
|
+* continue Statement:: Begin the next cycle of a loop.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node while Statement
|
|
|
+@subsection @code{while} Statement
|
|
|
+@cindex @code{while} statement
|
|
|
+@cindex statement, @code{while}
|
|
|
+@findex while
|
|
|
+
|
|
|
+The @code{while} statement is the simplest loop construct.
|
|
|
+It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+while (@var{test})
|
|
|
+ @var{body}
|
|
|
+@end example
|
|
|
+
|
|
|
+Here, @var{body} is a statement (often a nested block) to repeat, and
|
|
|
+@var{test} is the test expression that controls whether to repeat it again.
|
|
|
+Each iteration of the loop starts by computing @var{test} and, if it
|
|
|
+is true (nonzero), that means the loop should execute @var{body} again
|
|
|
+and then start over.
|
|
|
+
|
|
|
+Here's an example of advancing to the last structure in a chain of
|
|
|
+structures chained through the @code{next} field:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stddef.h> /* @r{Defines @code{NULL}.} */
|
|
|
+@r{@dots{}}
|
|
|
+while (chain->next != NULL)
|
|
|
+ chain = chain->next;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This code assumes the chain isn't empty to start with; if the chain is
|
|
|
+empty (that is, if @code{chain} is a null pointer), the code gets a
|
|
|
+@code{SIGSEGV} signal trying to dereference that null pointer (@pxref{Signals}).
|
|
|
+
|
|
|
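One possible way to handle an empty chain is to test for it before the loop, as in this sketch. The structure @code{struct link} and the function @code{last_link} are assumptions made for the example.

```c
#include <stddef.h>   /* Defines NULL.  */

/* Hypothetical chain element for this sketch.  */
struct link
{
  struct link *next;
  int value;
};

/* Return the last structure in CHAIN, or a null pointer
   if the chain is empty.  */
struct link *
last_link (struct link *chain)
{
  if (chain == NULL)
    return NULL;
  while (chain->next != NULL)
    chain = chain->next;
  return chain;
}
```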
+@node do-while Statement
|
|
|
+@subsection @code{do-while} Statement
|
|
|
+@cindex @code{do}--@code{while} statement
|
|
|
+@cindex statement, @code{do}--@code{while}
|
|
|
+@findex do
|
|
|
+
|
|
|
+The @code{do}--@code{while} statement is a simple loop construct that
|
|
|
+performs the test at the end of the iteration.
|
|
|
+
|
|
|
+@example
|
|
|
+do
|
|
|
+ @var{body}
|
|
|
+while (@var{test});
|
|
|
+@end example
|
|
|
+
|
|
|
+Here, @var{body} is a statement (possibly a block) to repeat, and
|
|
|
+@var{test} is an expression that controls whether to repeat it again.
|
|
|
+
|
|
|
+Each iteration of the loop starts by executing @var{body}. Then it
|
|
|
+computes @var{test} and, if it is true (nonzero), that means to go
|
|
|
+back and start over with @var{body}. If @var{test} is false (zero),
|
|
|
+then the loop stops repeating and execution moves on past it.
|
|
|
+
|
|
|
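Testing at the end is convenient when the body must run at least once. As a sketch, the hypothetical function @code{count_digits} counts the decimal digits of a number; even zero has one digit, so the body has to execute before the first test.

```c
/* Return the number of decimal digits needed to write N.  */
int
count_digits (unsigned int n)
{
  int digits = 0;
  do
    {
      ++digits;
      n /= 10;
    }
  while (n != 0);
  return digits;
}
```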
+@node break Statement
|
|
|
+@subsection @code{break} Statement
|
|
|
+@cindex @code{break} statement
|
|
|
+@cindex statement, @code{break}
|
|
|
+@findex break
|
|
|
+
|
|
|
+The @code{break} statement looks like @samp{break;}. Its effect is to
|
|
|
+exit immediately from the innermost loop construct or @code{switch}
|
|
|
+statement (@pxref{switch Statement}).
|
|
|
+
|
|
|
+For example, this loop advances @code{p} until the next null
|
|
|
+character or newline.
|
|
|
+
|
|
|
+@example
|
|
|
+while (*p)
|
|
|
+ @{
|
|
|
+ /* @r{End loop if we have reached a newline.} */
|
|
|
+ if (*p == '\n')
|
|
|
+ break;
|
|
|
+ p++;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+When there are nested loops, the @code{break} statement exits from the
|
|
|
+innermost loop containing it.
|
|
|
+
|
|
|
+@example
|
|
|
+struct list_if_tuples
|
|
|
+@{
|
|
|
+ struct list_if_tuples *next;
|
|
|
+ int length;
|
|
|
+ data *contents;
|
|
|
+@};
|
|
|
+
|
|
|
+void
|
|
|
+process_all_elements (struct list_if_tuples *list)
|
|
|
+@{
|
|
|
+ while (list)
|
|
|
+ @{
|
|
|
+ /* @r{Process all the elements in this node's vector,}
|
|
|
+ @r{stopping when we reach one that is null.} */
|
|
|
+ for (int i = 0; i < list->length; i++)
|
|
|
+ @{
|
|
|
+ /* @r{Null element terminates this node's vector.} */
|
|
|
+ if (list->contents[i] == NULL)
|
|
|
+ /* @r{Exit the @code{for} loop.} */
|
|
|
+ break;
|
|
|
+ /* @r{Operate on the next element.} */
|
|
|
+ process_element (list->contents[i]);
|
|
|
+ @}
|
|
|
+
|
|
|
+ list = list->next;
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The only way in C to exit from an outer loop directly, short of returning from the function, is with
|
|
|
+@code{goto} (@pxref{goto Statement}).
|
|
|
+
|
|
|
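Here is a sketch of leaving two nested loops at once with @code{goto}. The names @code{NROWS}, @code{NCOLS} and @code{find_negative}, and the label @code{done}, are hypothetical.

```c
#define NROWS 3   /* Hypothetical dimensions for this sketch.  */
#define NCOLS 3

/* Search GRID for a negative element.  If there is one, store the
   position of the first one found in *ROW_OUT and *COL_OUT and
   return 1; otherwise return 0.  */
int
find_negative (int grid[NROWS][NCOLS], int *row_out, int *col_out)
{
  int found = 0;
  int row, col;

  for (row = 0; row < NROWS; ++row)
    for (col = 0; col < NCOLS; ++col)
      if (grid[row][col] < 0)
        {
          found = 1;
          /* A break here would exit only the inner loop.  */
          goto done;
        }

 done:
  if (found)
    {
      *row_out = row;
      *col_out = col;
    }
  return found;
}
```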
+@node for Statement
|
|
|
+@subsection @code{for} Statement
|
|
|
+@cindex @code{for} statement
|
|
|
+@cindex statement, @code{for}
|
|
|
+@findex for
|
|
|
+
|
|
|
+A @code{for} statement uses three expressions written inside a
|
|
|
+parenthetical group to define the repetition of the loop. The first
|
|
|
+expression says how to prepare to start the loop. The second says how
|
|
|
+to test, before each iteration, whether to continue looping. The
|
|
|
+third says how to advance, at the end of an iteration, for the next
|
|
|
+iteration. All together, it looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+for (@var{start}; @var{continue-test}; @var{advance})
|
|
|
+ @var{body}
|
|
|
+@end example
|
|
|
+
|
|
|
+The first thing the @code{for} statement does is compute @var{start}.
|
|
|
+The next thing it does is compute the expression @var{continue-test}.
|
|
|
+If that expression is false (zero), the @code{for} statement finishes
|
|
|
+immediately, so @var{body} is executed zero times.
|
|
|
+
|
|
|
+However, if @var{continue-test} is true (nonzero), the @code{for}
|
|
|
+statement executes @var{body}, then @var{advance}. Then it loops back
|
|
|
+to the not-quite-top to test @var{continue-test} again. But it does
|
|
|
+not compute @var{start} again.
|
|
|
+
|
|
|
+@node Example of for
|
|
|
+@subsection Example of @code{for}
|
|
|
+
|
|
|
+Here is the @code{for} statement from the iterative Fibonacci
|
|
|
+function:
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+for (i = 1; i < n; ++i)
|
|
|
+ /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
|
|
|
+ /* @r{since @code{i < n} is false the first time.} */
|
|
|
+ @{
|
|
|
+ /* @r{Now @var{last} is @code{fib (@var{i})}}
|
|
|
+ @r{and @var{prev} is @code{fib (@var{i} @minus{} 1)}.} */
|
|
|
+ /* @r{Compute @code{fib (@var{i} + 1)}.} */
|
|
|
+ int next = prev + last;
|
|
|
+ /* @r{Shift the values down.} */
|
|
|
+ prev = last;
|
|
|
+ last = next;
|
|
|
+ /* @r{Now @var{last} is @code{fib (@var{i} + 1)}}
|
|
|
+ @r{and @var{prev} is @code{fib (@var{i})}.}
|
|
|
+ @r{But that won't stay true for long,}
|
|
|
+ @r{because we are about to increment @var{i}.} */
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+In this example, @var{start} is @code{i = 1}, meaning set @code{i} to
|
|
|
+1. @var{continue-test} is @code{i < n}, meaning keep repeating the
|
|
|
+loop as long as @code{i} is less than @code{n}. @var{advance} is
|
|
|
+@code{++i}, meaning increment @code{i} by 1. The body is a block
|
|
|
+that contains a declaration and two statements.
|
|
|
+
|
|
|
+@node Omitted for-Expressions
|
|
|
+@subsection Omitted @code{for}-Expressions
|
|
|
+
|
|
|
+A fully-fleshed @code{for} statement contains all these parts,
|
|
|
+
|
|
|
+@example
|
|
|
+for (@var{start}; @var{continue-test}; @var{advance})
|
|
|
+ @var{body}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+but you can omit any of the three expressions inside the parentheses.
|
|
|
+The parentheses and the two semicolons are required syntactically, but
|
|
|
+the expressions between them may be missing. A missing expression
|
|
|
+means this loop doesn't use that particular feature of the @code{for}
|
|
|
+statement.
|
|
|
+
|
|
|
+Instead of using @var{start}, you can do the loop preparation
|
|
|
+before the @code{for} statement: the effect is the same. So we
|
|
|
+could have written the beginning of a loop that counts from 0 this way:
|
|
|
+
|
|
|
+@example
|
|
|
+int i = 0;
|
|
|
+for (; i < n; ++i)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+instead of this way:
|
|
|
+
|
|
|
+@example
|
|
|
+int i;
|
|
|
+for (i = 0; i < n; ++i)
|
|
|
+@end example
|
|
|
+
|
|
|
+Omitting @var{continue-test} means the loop runs forever (or until
|
|
|
+something else causes exit from it). Statements inside the loop can
|
|
|
+test conditions for termination and use @samp{break;} to exit. This
|
|
|
+is more flexible since you can put those tests anywhere in the loop,
|
|
|
+not solely at the beginning.
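
As a sketch of a loop with no @var{continue-test}, the hypothetical function below leaves the middle expression empty and decides inside the body when to stop. The function name and its parameters are assumptions made for this illustration.

```c
#include <stddef.h>

/* Return the index of the first negative element of VALUES,
   or LEN if there is none.  */
size_t
first_negative (const int *values, size_t len)
{
  size_t i;
  /* No continue-test: the loop ends only through break.  */
  for (i = 0; ; i++)
    {
      if (i == len)
        break;                  /* ran past the last element */
      if (values[i] < 0)
        break;                  /* found a negative element */
    }
  return i;
}
```

Both tests could have been combined into a @var{continue-test}; writing them in the body is useful mainly when some work has to happen between them.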
|
|
|
+
|
|
|
+Putting an expression in @var{advance} is almost equivalent to writing
|
|
|
+it at the end of the loop body. The only difference arises with the
|
|
|
+@code{continue} statement, which still executes @var{advance}
|
|
|
+(@pxref{continue Statement}). So we could have written this:
|
|
|
+
|
|
|
+@example
|
|
|
+for (i = 0; i < n;)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ ++i;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+instead of this:
|
|
|
+
|
|
|
+@example
|
|
|
+for (i = 0; i < n; ++i)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+The choice is mainly a matter of what is more readable for
|
|
|
+programmers. However, there is also a syntactic difference:
|
|
|
+@var{advance} is an expression, not a statement. It can't include
|
|
|
+loops, blocks, declarations, etc.
|
|
|
+
|
|
|
+@node for-Index Declarations
|
|
|
+@subsection @code{for}-Index Declarations
|
|
|
+
|
|
|
+You can declare loop-index variables directly in the @var{start}
|
|
|
+portion of the @code{for}-loop, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+for (int i = 0; i < n; ++i)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+This kind of @var{start} is limited to a single declaration; it can
|
|
|
+declare one or more variables, separated by commas, all of which are
|
|
|
+the same @var{basetype} (@code{int}, in this example):
|
|
|
+
|
|
|
+@example
|
|
|
+for (int i = 0, j = 1, *p = NULL; i < n; ++i, ++j, ++p)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The scope of these variables is the @code{for} statement as a whole.
|
|
|
+See @ref{Variable Declarations}, for an explanation of @var{basetype}.
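
The scope rule can be seen in a small sketch. The function below is hypothetical; it declares two variables in the @code{for} header, one of which has the same name as a variable of the enclosing block. The loop's @code{i} hides the outer @code{i} only for the duration of the @code{for} statement.

```c
/* Return 100 plus the sum of the squares of 1 through N.  */
int
scope_demo (int n)
{
  int i = 100;                  /* outer i */
  int total = 0;

  /* This i and sq exist only within the for statement.  */
  for (int i = 1, sq = 1; i <= n; ++i, sq = i * i)
    total += sq;

  /* Here i refers to the outer variable again, which the loop
     never touched.  */
  return total + i;
}
```

Reusing a name this way is legal but easy to misread; it is done here only to make the scope visible.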
|
|
|
+
|
|
|
+Variables declared in @code{for} statements should have initializers.
|
|
|
+Omitting the initialization gives the variables unpredictable initial
|
|
|
+values, so this code is erroneous.
|
|
|
+
|
|
|
+@example
|
|
|
+for (int i; i < n; ++i)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node continue Statement
|
|
|
+@subsection @code{continue} Statement
|
|
|
+@cindex @code{continue} statement
|
|
|
+@cindex statement, @code{continue}
|
|
|
+@findex continue
|
|
|
+
|
|
|
+The @code{continue} statement looks like @samp{continue;}, and its
|
|
|
+effect is to jump immediately to the end of the innermost loop
|
|
|
+construct. If it is a @code{for}-loop, the next thing that happens
|
|
|
+is to execute the loop's @var{advance} expression.
|
|
|
+
|
|
|
+For example, this loop increments @code{p} until the next null character
|
|
|
+or newline, and operates (in some way not shown) on all the characters
|
|
|
+in the line except for spaces. All it does with spaces is skip them.
|
|
|
+
|
|
|
+@example
|
|
|
+for (;*p; ++p)
|
|
|
+ @{
|
|
|
+ /* @r{End loop if we have reached a newline.} */
|
|
|
+ if (*p == '\n')
|
|
|
+ break;
|
|
|
+ /* @r{Pay no attention to spaces.} */
|
|
|
+ if (*p == ' ')
|
|
|
+ continue;
|
|
|
+ /* @r{Operate on the next character.} */
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Executing @samp{continue;} skips the rest of the loop body, but it does
|
|
|
+not skip the @var{advance} expression, @code{++p}.
|
|
|
+
|
|
|
+We could also write it like this:
|
|
|
+
|
|
|
+@example
|
|
|
+for (;*p; ++p)
|
|
|
+ @{
|
|
|
+ /* @r{Exit if we have reached a newline.} */
|
|
|
+ if (*p == '\n')
|
|
|
+ break;
|
|
|
+ /* @r{Pay no attention to spaces.} */
|
|
|
+ if (*p != ' ')
|
|
|
+ @{
|
|
|
+ /* @r{Operate on the next character.} */
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+The advantage of using @code{continue} is that it reduces the
|
|
|
+depth of nesting.
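
Here is a complete function built around the same loop, as a minimal sketch. The function name is invented for the illustration, and ``operating'' on a character is reduced to counting it.

```c
/* Count the characters in the first line of TEXT that are
   not spaces.  */
int
count_nonspaces (const char *text)
{
  int count = 0;
  for (const char *p = text; *p; ++p)
    {
      /* End the loop if we have reached a newline.  */
      if (*p == '\n')
        break;
      /* Skip spaces; the advance expression ++p still runs.  */
      if (*p == ' ')
        continue;
      count++;
    }
  return count;
}
```

If the increment were written at the end of the body instead of in the @code{for} header, @code{continue} would skip it, and the loop would test the same space forever.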
|
|
|
+
|
|
|
+Contrast @code{continue} with the @code{break} statement. @xref{break
|
|
|
+Statement}.
|
|
|
+
|
|
|
+@node switch Statement
|
|
|
+@section @code{switch} Statement
|
|
|
+@cindex @code{switch} statement
|
|
|
+@cindex statement, @code{switch}
|
|
|
+@findex switch
|
|
|
+@findex case
|
|
|
+@findex default
|
|
|
+
|
|
|
+The @code{switch} statement selects code to run according to the value
|
|
|
+of an expression. The expression, in parentheses, follows the keyword
|
|
|
+@code{switch}. After that come all the cases to select among,
|
|
|
+inside braces. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+switch (@var{selector})
|
|
|
+ @{
|
|
|
+ @var{cases}@r{@dots{}}
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+A case can look like this:
|
|
|
+
|
|
|
+@example
|
|
|
+case @var{value}:
|
|
|
+ @var{statements}
|
|
|
+ break;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which means ``come here if @var{selector} happens to have the value
|
|
|
+@var{value},'' or like this (a GNU C extension):
|
|
|
+
|
|
|
+@example
|
|
|
+case @var{rangestart} ... @var{rangeend}:
|
|
|
+ @var{statements}
|
|
|
+ break;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which means ``come here if @var{selector} happens to have a value
|
|
|
+between @var{rangestart} and @var{rangeend} (inclusive).'' @xref{Case
|
|
|
+Ranges}.
|
|
|
+
|
|
|
+The values in @code{case} labels must reduce to integer constants.
|
|
|
+They can use arithmetic, and @code{enum} constants, but they cannot
|
|
|
+refer to data in memory, because they have to be computed at compile
|
|
|
+time. It is an error if two @code{case} labels specify the same
|
|
|
+value, or ranges that overlap, or if one is a range and the other is a
|
|
|
+value in that range.
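
A brief sketch of what ``reduce to integer constants'' allows: the hypothetical function below uses an @code{enum} constant directly and arithmetic on @code{enum} constants in its @code{case} labels. All names are assumptions made for the example.

```c
enum shape { TRIANGLE, SQUARE, PENTAGON, SHAPE_COUNT };

/* Return the number of sides for a shape code, or -1 if the
   code is not recognized.  */
int
sides (int code)
{
  switch (code)
    {
    case TRIANGLE:              /* an enum constant */
      return 3;
    case TRIANGLE + 1:          /* constant arithmetic: SQUARE */
      return 4;
    case SHAPE_COUNT - 1:       /* constant arithmetic: PENTAGON */
      return 5;
    default:
      return -1;
    }
}
```

Writing @code{case code + 1:} instead would be rejected, since @code{code} is a variable whose value is not known at compile time.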
|
|
|
+
|
|
|
+You can also define a default case to handle ``any other value,'' like
|
|
|
+this:
|
|
|
+
|
|
|
+@example
|
|
|
+default:
|
|
|
+ @var{statements}
|
|
|
+ break;
|
|
|
+@end example
|
|
|
+
|
|
|
+If the @code{switch} statement has no @code{default:} label, then it
|
|
|
+does nothing when the value matches none of the cases.
|
|
|
+
|
|
|
+The brace-group inside the @code{switch} statement is a block, and you
|
|
|
+can declare variables with that scope just as in any other block
|
|
|
+(@pxref{Blocks}). However, initializers in these declarations won't
|
|
|
+necessarily be executed every time the @code{switch} statement runs,
|
|
|
+so it is best to avoid giving them initializers.
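
One way to follow that advice, sketched with a hypothetical function: declare the variable at the start of the @code{switch} block without an initializer, and assign it in each case that uses it. Control jumps directly to a @code{case} label, so nothing written before the first label is executed.

```c
/* Scale X according to KIND.  */
int
scaled (int kind, int x)
{
  switch (kind)
    {
      /* Declared for the whole block, with no initializer;
         an initializer here would never be executed.  */
      int factor;

    case 1:
      factor = 2;
      return x * factor;

    case 2:
      factor = 10;
      return x * factor;

    default:
      return x;
    }
}
```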
|
|
|
+
|
|
|
+@code{break;} inside a @code{switch} statement exits immediately from
|
|
|
+the @code{switch} statement. @xref{break Statement}.
|
|
|
+
|
|
|
+If there is no @code{break;} at the end of the code for a case,
|
|
|
+execution continues into the code for the following case. This
|
|
|
+happens more often by mistake than intentionally, but since this
|
|
|
+feature is used in real code, we cannot eliminate it.
|
|
|
+
|
|
|
+@strong{Warning:} When one case is intended to fall through to the
|
|
|
+next, write a comment like @samp{falls through} to say it's
|
|
|
+intentional. That way, other programmers won't assume it was an error
|
|
|
+and ``fix'' it erroneously.
|
|
|
+
|
|
|
+Consecutive @code{case} labels could, pedantically, be considered
|
|
|
+an instance of falling through, but we don't consider or treat them that
|
|
|
+way because they won't confuse anyone.
|
|
|
+
|
|
|
+@node switch Example
|
|
|
+@section Example of @code{switch}
|
|
|
+
|
|
|
+Here's an example of using the @code{switch} statement
|
|
|
+to distinguish among characters:
|
|
|
+
|
|
|
+@cindex counting vowels and punctuation
|
|
|
+@example
|
|
|
+struct vp @{ int vowels, punct; @};
|
|
|
+
|
|
|
+struct vp
|
|
|
+count_vowels_and_punct (char *string)
|
|
|
+@{
|
|
|
+ int c;
|
|
|
+ int vowels = 0;
|
|
|
+ int punct = 0;
|
|
|
+ /* @r{Don't change the parameter itself.} */
|
|
|
+ /* @r{That helps in debugging.} */
|
|
|
+ char *p = string;
|
|
|
+ struct vp value;
|
|
|
+
|
|
|
+ while (c = *p++)
|
|
|
+ switch (c)
|
|
|
+ @{
|
|
|
+ case 'y':
|
|
|
+ case 'Y':
|
|
|
+ /* @r{We assume @code{y_is_consonant} will check surrounding
|
|
|
+ letters to determine whether this y is a vowel.} */
|
|
|
+ if (y_is_consonant (p - 1))
|
|
|
+ break;
|
|
|
+
|
|
|
+ /* @r{Falls through} */
|
|
|
+
|
|
|
+ case 'a':
|
|
|
+ case 'e':
|
|
|
+ case 'i':
|
|
|
+ case 'o':
|
|
|
+ case 'u':
|
|
|
+ case 'A':
|
|
|
+ case 'E':
|
|
|
+ case 'I':
|
|
|
+ case 'O':
|
|
|
+ case 'U':
|
|
|
+ vowels++;
|
|
|
+ break;
|
|
|
+
|
|
|
+ case '.':
|
|
|
+ case ',':
|
|
|
+ case ':':
|
|
|
+ case ';':
|
|
|
+ case '?':
|
|
|
+ case '!':
|
|
|
+ case '\"':
|
|
|
+ case '\'':
|
|
|
+ punct++;
|
|
|
+ break;
|
|
|
+ @}
|
|
|
+
|
|
|
+ value.vowels = vowels;
|
|
|
+ value.punct = punct;
|
|
|
+
|
|
|
+ return value;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Duffs Device
|
|
|
+@section Duff's Device
|
|
|
+@cindex Duff's device
|
|
|
+
|
|
|
+The cases in a @code{switch} statement can be inside other control
|
|
|
+constructs. For instance, we can use a technique known as @dfn{Duff's
|
|
|
+device} to optimize this simple function,
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+copy (char *to, char *from, int count)
|
|
|
+@{
|
|
|
+ while (count > 0)
|
|
|
+ *to++ = *from++, count--;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+which copies memory starting at @var{from} to memory starting at
|
|
|
+@var{to}.
|
|
|
+
|
|
|
+Duff's device involves unrolling the loop so that it copies
|
|
|
+several characters each time around, and using a @code{switch} statement
|
|
|
+to enter the loop body at the proper point:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+copy (char *to, char *from, int count)
|
|
|
+@{
|
|
|
+ if (count <= 0)
|
|
|
+ return;
|
|
|
+ int n = (count + 7) / 8;
|
|
|
+ switch (count % 8)
|
|
|
+ @{
|
|
|
+ do @{
|
|
|
+ case 0: *to++ = *from++;
|
|
|
+ case 7: *to++ = *from++;
|
|
|
+ case 6: *to++ = *from++;
|
|
|
+ case 5: *to++ = *from++;
|
|
|
+ case 4: *to++ = *from++;
|
|
|
+ case 3: *to++ = *from++;
|
|
|
+ case 2: *to++ = *from++;
|
|
|
+ case 1: *to++ = *from++;
|
|
|
+ @} while (--n > 0);
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end example
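
To see how the entry point is chosen, take @code{count} equal to 11. Then @code{n} is 2 and @code{count % 8} is 3, so the @code{switch} jumps to @code{case 3} and the first, partial pass copies 3 characters; the loop then makes one full pass of 8, for 11 in all. The sketch below repeats the function (renamed, and with a @code{const} source pointer) next to a hypothetical checking function, so that the behavior can be tested for several counts. The buffer size and the test pattern are arbitrary choices.

```c
#include <stddef.h>
#include <string.h>

#define BUFFER_SIZE 64

static void
duff_copy (char *to, const char *from, int count)
{
  if (count <= 0)
    return;
  int n = (count + 7) / 8;
  switch (count % 8)
    {
      do {
        case 0: *to++ = *from++;
        case 7: *to++ = *from++;
        case 6: *to++ = *from++;
        case 5: *to++ = *from++;
        case 4: *to++ = *from++;
        case 3: *to++ = *from++;
        case 2: *to++ = *from++;
        case 1: *to++ = *from++;
      } while (--n > 0);
    }
}

/* Return 1 if duff_copy copies exactly COUNT characters,
   leaving the rest of the destination untouched.  */
int
duff_copy_matches (int count)
{
  char src[BUFFER_SIZE], dst[BUFFER_SIZE];

  if (count < 0 || count > BUFFER_SIZE)
    return 0;
  for (int i = 0; i < BUFFER_SIZE; i++)
    {
      src[i] = (char) ('a' + i % 26);
      dst[i] = '#';
    }

  duff_copy (dst, src, count);

  if (memcmp (dst, src, (size_t) count) != 0)
    return 0;
  for (int i = count; i < BUFFER_SIZE; i++)
    if (dst[i] != '#')
      return 0;
  return 1;
}
```

Modern compilers often unroll simple loops by themselves, so it is worth measuring before assuming that this technique makes a program faster.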
|
|
|
+
|
|
|
+@node Case Ranges
|
|
|
+@section Case Ranges
|
|
|
+@cindex case ranges
|
|
|
+@cindex ranges in case statements
|
|
|
+
|
|
|
+You can specify a range of consecutive values in a single @code{case} label,
|
|
|
+like this:
|
|
|
+
|
|
|
+@example
|
|
|
+case @var{low} ... @var{high}:
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This has the same effect as the proper number of individual @code{case}
|
|
|
+labels, one for each integer value from @var{low} to @var{high}, inclusive.
|
|
|
+
|
|
|
+This feature is especially useful for ranges of ASCII character codes:
|
|
|
+
|
|
|
+@example
|
|
|
+case 'A' ... 'Z':
|
|
|
+@end example
|
|
|
+
|
|
|
+@strong{Be careful:} with integers, write spaces around the @code{...}
|
|
|
+to prevent it from being parsed incorrectly. For example, write this:
|
|
|
+
|
|
|
+@example
|
|
|
+case 1 ... 5:
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+rather than this:
|
|
|
+
|
|
|
+@example
|
|
|
+case 1...5:
|
|
|
+@end example
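
Putting the pieces of this section together, here is a small sketch of a function that classifies a character with case ranges. It relies on the GNU C extension described here and assumes the ASCII character set; the function name and its return codes are invented for the example.

```c
/* Classify C: 'U' for an upper-case letter, 'L' for a lower-case
   letter, 'D' for a digit, and '?' for anything else.  */
char
classify (int c)
{
  switch (c)
    {
    case 'A' ... 'Z':
      return 'U';
    case 'a' ... 'z':
      return 'L';
    case '0' ... '9':
      return 'D';
    default:
      return '?';
    }
}
```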
|
|
|
+
|
|
|
+@node Null Statement
|
|
|
+@section Null Statement
|
|
|
+@cindex null statement
|
|
|
+@cindex statement, null
|
|
|
+
|
|
|
+A @dfn{null statement} is just a semicolon. It does nothing.
|
|
|
+
|
|
|
+A null statement is a placeholder for use where a statement is
|
|
|
+grammatically required, but there is nothing to be done. For
|
|
|
+instance, sometimes all the work of a @code{for}-loop is done in the
|
|
|
+@code{for}-header itself, leaving no work for the body. Here is an
|
|
|
+example that searches for the first newline in @code{array}:
|
|
|
+
|
|
|
+@example
|
|
|
+for (p = array; *p != '\n'; p++)
|
|
|
+ ;
|
|
|
+@end example
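
The loop above keeps going until it finds a newline, so it is safe only if @code{array} is known to contain one. A slightly more defensive sketch, wrapped in a hypothetical function, also stops at the terminating null character; the body is still a null statement.

```c
/* Return a pointer to the first newline in ARRAY, or to its
   terminating null character if it contains no newline.  */
const char *
find_line_end (const char *array)
{
  const char *p;
  for (p = array; *p != '\0' && *p != '\n'; p++)
    ;                   /* null statement: the header does the work */
  return p;
}
```

Putting the semicolon on a line by itself makes it clear that the empty body is intentional.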
|
|
|
+
|
|
|
+@node goto Statement
|
|
|
+@section @code{goto} Statement and Labels
|
|
|
+@cindex @code{goto} statement
|
|
|
+@cindex statement, @code{goto}
|
|
|
+@cindex label
|
|
|
+@findex goto
|
|
|
+
|
|
|
+The @code{goto} statement looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+goto @var{label};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Its effect is to transfer control immediately to another part of the
|
|
|
+current function---where the label named @var{label} is defined.
|
|
|
+
|
|
|
+An ordinary label definition looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{label}:
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and it can appear before any statement. You can't use @code{default}
|
|
|
+as a label, since that has a special meaning for @code{switch}
|
|
|
+statements.
|
|
|
+
|
|
|
+An ordinary label doesn't need a separate declaration; defining it is
|
|
|
+enough.
|
|
|
+
|
|
|
+Here's an example of using @code{goto} to implement a loop
|
|
|
+equivalent to @code{do}--@code{while}:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ loop_restart:
|
|
|
+ @var{body}
|
|
|
+ if (@var{condition})
|
|
|
+ goto loop_restart;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The name space of labels is separate from that of variables and functions.
|
|
|
+Thus, there is no error in using a single name in both ways:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ int foo; // @r{Variable @code{foo}.}
|
|
|
+ foo: // @r{Label @code{foo}.}
|
|
|
+ @var{body}
|
|
|
+ if (foo > 0) // @r{Variable @code{foo}.}
|
|
|
+ goto foo; // @r{Label @code{foo}.}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Blocks have no effect on ordinary labels; each label name is defined
|
|
|
+throughout the whole of the function it appears in. It looks strange to
|
|
|
+jump into a block with @code{goto}, but it works. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+if (x < 0)
|
|
|
+ goto negative;
|
|
|
+if (y < 0)
|
|
|
+ @{
|
|
|
+ negative:
|
|
|
+ printf ("Negative\n");
|
|
|
+ return;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+If the @code{goto} jumps into the scope of a variable, it does not
|
|
|
+initialize the variable. For example, if @code{x} is negative,
|
|
|
+
|
|
|
+@example
|
|
|
+if (x < 0)
|
|
|
+ goto negative;
|
|
|
+if (y < 0)
|
|
|
+ @{
|
|
|
+ int i = 5;
|
|
|
+ negative:
|
|
|
+ printf ("Negative, and i is %d\n", i);
|
|
|
+ return;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+prints junk because @code{i} was not initialized.
|
|
|
+
|
|
|
+If the block declares a variable-length automatic array, jumping into
|
|
|
+it gives a compilation error. However, jumping out of the scope of a
|
|
|
+variable-length array works fine, and deallocates its storage.
|
|
|
+
|
|
|
+A label can't come directly before a declaration, so the code can't
|
|
|
+jump directly to one. For example, this is not allowed:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ goto foo;
|
|
|
+foo:
|
|
|
+ int x = 5;
|
|
|
+ bar(&x);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The workaround is to add a statement, even an empty statement,
|
|
|
+directly after the label. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ goto foo;
|
|
|
+foo:
|
|
|
+ ;
|
|
|
+ int x = 5;
|
|
|
+ bar(&x);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Likewise, a label can't be the last thing in a block. The workaround
|
|
|
+is the same: add a semicolon after the label.
|
|
|
+
|
|
|
+These unnecessary restrictions on labels make no sense, and ought in
|
|
|
+principle to be removed; but they do only a little harm since labels
|
|
|
+and @code{goto} are rarely the best way to write a program.
|
|
|
+
|
|
|
+These examples are all artificial; it would be more natural to
|
|
|
+write them in other ways, without @code{goto}. For instance,
|
|
|
+the clean way to write the example that prints @samp{Negative} is this:
|
|
|
+
|
|
|
+@example
|
|
|
+if (x < 0 || y < 0)
|
|
|
+ @{
|
|
|
+ printf ("Negative\n");
|
|
|
+ return;
|
|
|
+ @}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+It is hard to construct simple examples where @code{goto} is actually
|
|
|
+the best way to write a program. Its rare good uses tend to be in
|
|
|
+complex code, thus not apt for the purpose of explaining the meaning
|
|
|
+of @code{goto}.
|
|
|
+
|
|
|
+The only good time to use @code{goto} is when it makes the code
|
|
|
+simpler than any alternative. Jumping backward is rarely desirable,
|
|
|
+because usually the other looping and control constructs give simpler
|
|
|
+code. Using @code{goto} to jump forward is more often desirable, for
|
|
|
+instance when a function needs to do some processing in an error case
|
|
|
+and errors can occur at various different places within the function.
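
The error-handling case can be sketched as follows. The function is hypothetical and deliberately simple: it needs two temporary buffers, can fail at three different places, and uses forward jumps to a single label so that the cleanup code is written only once.

```c
#include <stddef.h>
#include <stdlib.h>

/* Store in *OUT the sum of I*I + I*I*I for I from 0 to N - 1.
   Return 0 on success, -1 on failure; *OUT is unchanged on failure.  */
int
sum_powers (int n, long *out)
{
  int status = -1;
  long total = 0;
  long *squares = NULL;
  long *cubes = NULL;

  if (n <= 0)
    goto done;
  squares = malloc ((size_t) n * sizeof *squares);
  if (squares == NULL)
    goto done;
  cubes = malloc ((size_t) n * sizeof *cubes);
  if (cubes == NULL)
    goto done;

  for (int i = 0; i < n; i++)
    {
      squares[i] = (long) i * i;
      cubes[i] = squares[i] * i;
      total += squares[i] + cubes[i];
    }
  *out = total;
  status = 0;

 done:
  /* Calling free on a null pointer does nothing, so this is
     correct no matter which step failed.  */
  free (cubes);
  free (squares);
  return status;
}
```

Without @code{goto}, each failure point would need its own copy of the cleanup code, or the function would need nested @code{if} statements several levels deep.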
|
|
|
+
|
|
|
+@node Local Labels
|
|
|
+@section Locally Declared Labels
|
|
|
+@cindex local labels
|
|
|
+@cindex macros, local labels
|
|
|
+@findex __label__
|
|
|
+
|
|
|
+In GNU C you can declare @dfn{local labels} in any nested block
|
|
|
+scope. A local label is used in a @code{goto} statement just like an
|
|
|
+ordinary label, but you can only reference it within the block in
|
|
|
+which it was declared.
|
|
|
+
|
|
|
+A local label declaration looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+__label__ @var{label};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or
|
|
|
+
|
|
|
+@example
|
|
|
+__label__ @var{label1}, @var{label2}, @r{@dots{}};
|
|
|
+@end example
|
|
|
+
|
|
|
+Local label declarations must come at the beginning of the block,
|
|
|
+before any ordinary declarations or statements.
|
|
|
+
|
|
|
+The label declaration declares the label @emph{name}, but does not define
|
|
|
+the label itself. That's done in the usual way, with
|
|
|
+@code{@var{label}:}, before one of the statements in the block.
|
|
|
+
|
|
|
+The local label feature is useful for complex macros. If a macro
|
|
|
+contains nested loops, a @code{goto} can be useful for breaking out of
|
|
|
+them. However, an ordinary label whose scope is the whole function
|
|
|
+cannot be used: if the macro can be expanded several times in one
|
|
|
+function, the label will be multiply defined in that function. A
|
|
|
+local label avoids this problem. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+#define SEARCH(value, array, target) \
|
|
|
+do @{ \
|
|
|
+ __label__ found; \
|
|
|
+ __auto_type _SEARCH_target = (target); \
|
|
|
+ __auto_type _SEARCH_array = (array); \
|
|
|
+ int i, j; \
|
|
|
+ for (i = 0; i < max; i++) \
|
|
|
+ for (j = 0; j < max; j++) \
|
|
|
+ if (_SEARCH_array[i][j] == _SEARCH_target) \
|
|
|
+ @{ (value) = i; goto found; @} \
|
|
|
+ (value) = -1; \
|
|
|
+ found:; \
|
|
|
+@} while (0)
|
|
|
+@end example
|
|
|
+
|
|
|
+This could also be written using a statement expression
|
|
|
+(@pxref{Statement Exprs}):
|
|
|
+
|
|
|
+@example
|
|
|
+#define SEARCH(array, target) \
|
|
|
+(@{ \
|
|
|
+ __label__ found; \
|
|
|
+ __auto_type _SEARCH_target = (target); \
|
|
|
+ __auto_type _SEARCH_array = (array); \
|
|
|
+ int i, j; \
|
|
|
+ int value; \
|
|
|
+ for (i = 0; i < max; i++) \
|
|
|
+ for (j = 0; j < max; j++) \
|
|
|
+ if (_SEARCH_array[i][j] == _SEARCH_target) \
|
|
|
+ @{ value = i; goto found; @} \
|
|
|
+ value = -1; \
|
|
|
+ found: \
|
|
|
+ value; \
|
|
|
+@})
|
|
|
+@end example
|
|
|
+
|
|
|
+Ordinary labels are visible throughout the function where they are
|
|
|
+defined, and only in that function. However, explicitly declared
|
|
|
+local labels of a block are visible in nested functions declared
|
|
|
+within that block. @xref{Nested Functions}, for details.
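
The benefit shows up when a macro is expanded more than once in the same function. The sketch below uses a hypothetical macro, @code{FIND_ROW}, modeled on the one above but with a fixed dimension @code{DIM} and differently named temporaries; these names are assumptions made for the example. It requires GNU C for @code{__label__} and @code{__auto_type}.

```c
#define DIM 3

/* Set VALUE to the index of the first row of ARRAY that contains
   TARGET, or to -1 if no row contains it.  */
#define FIND_ROW(value, array, target)                          \
do {                                                            \
  __label__ found;                                              \
  __auto_type find_row_target_ = (target);                      \
  __auto_type find_row_array_ = (array);                        \
  int find_row_i_, find_row_j_;                                 \
  for (find_row_i_ = 0; find_row_i_ < DIM; find_row_i_++)       \
    for (find_row_j_ = 0; find_row_j_ < DIM; find_row_j_++)     \
      if (find_row_array_[find_row_i_][find_row_j_]             \
          == find_row_target_)                                  \
        { (value) = find_row_i_; goto found; }                  \
  (value) = -1;                                                 \
 found:;                                                        \
} while (0)

int
find_row (int grid[DIM][DIM], int target)
{
  int row;
  FIND_ROW (row, grid, target);
  return row;
}

/* Two expansions in one function: each has its own `found'.  */
int
rows_differ (int grid[DIM][DIM], int a, int b)
{
  int row_a, row_b;
  FIND_ROW (row_a, grid, a);
  FIND_ROW (row_b, grid, b);
  return row_a != row_b;
}
```

With an ordinary label instead of a local one, @code{rows_differ} would define @code{found} twice and fail to compile.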
|
|
|
+
|
|
|
+@xref{goto Statement}.
|
|
|
+
|
|
|
+@node Labels as Values
|
|
|
+@section Labels as Values
|
|
|
+@cindex labels as values
|
|
|
+@cindex computed gotos
|
|
|
+@cindex goto with computed label
|
|
|
+@cindex address of a label
|
|
|
+
|
|
|
+In GNU C, you can get the address of a label defined in the current
|
|
|
+function (or a local label defined in the containing function) with
|
|
|
+the unary operator @samp{&&}. The value has type @code{void *}. This
|
|
|
+value is a constant and can be used wherever a constant of that type
|
|
|
+is valid. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+void *ptr;
|
|
|
+@r{@dots{}}
|
|
|
+ptr = &&foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+To use these values requires a way to jump to one. This is done
|
|
|
+with the computed goto statement@footnote{The analogous feature in
|
|
|
+Fortran is called an assigned goto, but that name seems inappropriate in
|
|
|
+C, since you can do more with label addresses than store them in special label
|
|
|
+variables.}, @code{goto *@var{exp};}. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+goto *ptr;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Any expression of type @code{void *} is allowed.
|
|
|
+
|
|
|
+@xref{goto Statement}.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Label Value Uses:: Examples of using label values.
|
|
|
+* Label Value Caveats:: Limitations of label values.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Label Value Uses
|
|
|
+@subsection Label Value Uses
|
|
|
+
|
|
|
+One use for label-valued constants is to initialize a static array to
|
|
|
+serve as a jump table:
|
|
|
+
|
|
|
+@example
|
|
|
+static void *array[] = @{ &&foo, &&bar, &&hack @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Then you can select a label with indexing, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+goto *array[i];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Note that this does not check whether the subscript is in bounds---array
|
|
|
+indexing in C never checks that.
|
|
|
+
|
|
|
+You can make the table entries offsets instead of addresses
|
|
|
+by subtracting one label from the others. Here is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+static const int array[] = @{ &&foo - &&foo, &&bar - &&foo,
|
|
|
+ &&hack - &&foo @};
|
|
|
+goto *(&&foo + array[i]);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Using offsets is preferable in shared libraries, as it avoids the need
|
|
|
+for dynamic relocation of the array elements; therefore, the array can
|
|
|
+be read-only.
|
|
|
+
|
|
|
+An array of label values or offsets serves a purpose much like that of
|
|
|
+the @code{switch} statement. The @code{switch} statement is cleaner,
|
|
|
+so use @code{switch} by preference when feasible.
|
|
|
+
|
|
|
+Another use of label values is in an interpreter for threaded code.
|
|
|
+The labels within the interpreter function can be stored in the
|
|
|
+threaded code for super-fast dispatching.
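
A very small sketch of such an interpreter follows. The instruction set (0 halts, 1 increments an accumulator, 2 doubles it), the function name and the @code{NEXT} macro are all invented for the illustration, and the code assumes every opcode in the program is 0, 1 or 2, since the table lookup is not bounds-checked. It requires GNU C.

```c
/* Run PROGRAM, a sequence of opcodes ending with 0, and return
   the final value of the accumulator.  */
int
run (const unsigned char *program)
{
  /* Jump table indexed by opcode.  */
  static void *dispatch[] = { &&op_halt, &&op_inc, &&op_double };
  int acc = 0;
  const unsigned char *pc = program;

  /* Fetch the next opcode and jump straight to its code.  */
#define NEXT() goto *dispatch[*pc++]

  NEXT ();

 op_inc:
  acc += 1;
  NEXT ();

 op_double:
  acc *= 2;
  NEXT ();

 op_halt:
  return acc;

#undef NEXT
}
```

Each instruction ends by jumping directly to the next one, rather than returning to the top of a loop containing a @code{switch} statement.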
|
|
|
+
|
|
|
+@node Label Value Caveats
|
|
|
+@subsection Label Value Caveats
|
|
|
+
|
|
|
+Jumping to a label defined in another function does not work.
|
|
|
+It can cause unpredictable results.
|
|
|
+
|
|
|
+The best way to avoid this is to store label values only in
|
|
|
+automatic variables, or static variables whose names are declared
|
|
|
+within the function. Never pass them as arguments.
|
|
|
+
|
|
|
+@cindex cloning
|
|
|
+An optimization known as @dfn{cloning} generates multiple simplified
|
|
|
+variants of a function's code, for use with specific fixed arguments.
|
|
|
+Using label values in certain ways, such as saving the address in one
|
|
|
+call to the function and using it again in another call, would make cloning
|
|
|
+give incorrect results. These functions must disable cloning.
|
|
|
+
|
|
|
+Inlining calls to the function would also result in multiple copies of
|
|
|
+the code, each with its own value of the same label. Using the label
|
|
|
+in a computed goto is no problem, because the computed goto inhibits
|
|
|
+inlining. However, using the label value in some other way, such as
|
|
|
+an indication of where an error occurred, would be optimized wrong.
|
|
|
+These functions must disable inlining.
|
|
|
+
|
|
|
+To prevent inlining or cloning of a function, specify
|
|
|
+@code{__attribute__((__noinline__,__noclone__))} in its definition.
|
|
|
+@xref{Attributes}.
|
|
|
+
|
|
|
+When a function uses a label value in a static variable initializer,
|
|
|
+that automatically prevents inlining or cloning the function.
|
|
|
+
|
|
|
+@node Statement Exprs
|
|
|
+@section Statements and Declarations in Expressions
|
|
|
+@cindex statements inside expressions
|
|
|
+@cindex declarations inside expressions
|
|
|
+@cindex expressions containing statements
|
|
|
+
|
|
|
+@c the above section title wrapped and causes an underfull hbox.. i
|
|
|
+@c changed it from "within" to "in". --mew 4feb93
|
|
|
+A block enclosed in parentheses can be used as an expression in GNU
|
|
|
+C@. This provides a way to use local variables, loops and switches within
|
|
|
+an expression. We call it a @dfn{statement expression}.
|
|
|
+
|
|
|
+Recall that a block is a sequence of statements
|
|
|
+surrounded by braces. In this construct, parentheses go around the
|
|
|
+braces. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+(@{ int y = foo (); int z;
|
|
|
+ if (y > 0) z = y;
|
|
|
+ else z = - y;
|
|
|
+ z; @})
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is a valid (though slightly more complex than necessary) expression
|
|
|
+for the absolute value of @code{foo ()}.
|
|
|
+
|
|
|
+The last statement in the block should be an expression statement; an
|
|
|
+expression followed by a semicolon, that is. The value of this
|
|
|
+expression serves as the value of the statement expression. If the last
|
|
|
+statement is anything else, the statement expression has type
|
|
|
+@code{void}, and therefore no value.
|
|
|
+
|
|
|
+This feature is mainly useful in making macro definitions compute each
|
|
|
+operand exactly once. @xref{Macros and Auto Type}.
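
As a minimal sketch of that use, the hypothetical macro @code{MAX_INT} below stores each operand in a local variable before comparing, so an operand with a side effect is evaluated only once. The helper function and the call counter exist only to make the number of evaluations observable. It requires GNU C.

```c
/* The larger of two int values; each operand is computed once.  */
#define MAX_INT(a, b)                           \
  ({ int max_a_ = (a);                          \
     int max_b_ = (b);                          \
     max_a_ > max_b_ ? max_a_ : max_b_; })

static int calls;

/* Return 10 on the first call, 20 on the second, and so on.  */
static int
next_value (void)
{
  return ++calls * 10;
}

/* Store the maximum of next_value () and 5 in *RESULT, and
   return how many times next_value was called.  */
int
max_once (int *result)
{
  calls = 0;
  *result = MAX_INT (next_value (), 5);
  return calls;
}
```

With the traditional definition @code{((a) > (b) ? (a) : (b))}, the operand @code{next_value ()} would be evaluated twice in this case, once for the comparison and once more for the result.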
|
|
|
+
|
|
|
+Statement expressions are not allowed in expressions that must be
|
|
|
+constant, such as the value for an enumerator, the width of a
|
|
|
+bit-field, or the initial value of a static variable.
|
|
|
+
|
|
|
+Jumping into a statement expression---with @code{goto}, or using a
|
|
|
+@code{switch} statement outside the statement expression---is an
|
|
|
+error. With a computed @code{goto} (@pxref{Labels as Values}), the
|
|
|
+compiler can't detect the error, but it still won't work.
|
|
|
+
|
|
|
+Jumping out of a statement expression is permitted, but since
|
|
|
+subexpressions in C are not computed in a strict order, it is
|
|
|
+unpredictable which other subexpressions will have been computed by
|
|
|
+then. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+ foo (), ((@{ bar1 (); goto a; 0; @}) + bar2 ()), baz();
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+calls @code{foo} and @code{bar1} before it jumps, and never
|
|
|
+calls @code{baz}, but may or may not call @code{bar2}. If @code{bar2}
|
|
|
+does get called, that occurs after @code{foo} and before @code{bar1}.
|
|
|
+
|
|
|
+@node Variables
|
|
|
+@chapter Variables
|
|
|
+@cindex variables
|
|
|
+
|
|
|
+Every variable used in a C program needs to be made known by a
|
|
|
+@dfn{declaration}. It can be used only after it has been declared.
|
|
|
+It is an error to declare a variable name more than once in the same
|
|
|
+scope; an exception is that @code{extern} declarations and tentative
|
|
|
+definitions can coexist with another declaration of the same
|
|
|
+variable.
|
|
|
+
|
|
|
+Variables can be declared anywhere within a block or file. (Older
|
|
|
+versions of C required that all variable declarations within a block
|
|
|
+occur before any statements.)
|
|
|
+
|
|
|
+Variables declared within a function or block are @dfn{local} to
|
|
|
+it. This means that the variable name is visible only until the end
|
|
|
+of that function or block, and the memory space is allocated only
|
|
|
+while control is within it.
|
|
|
+
|
|
|
+Variables declared at the top level in a file are called @dfn{file-scope}.
|
|
|
+They are assigned fixed, distinct memory locations, so they retain
|
|
|
+their values for the whole execution of the program.
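
The difference in lifetime can be sketched with two counters; the names are made up for the example. The file-scope variable keeps counting across calls, while the local variable starts over each time the function is entered.

```c
int total_calls;          /* file-scope: one copy, kept for the
                             whole execution of the program */

int
count_call (void)
{
  int local = 0;          /* local: created anew on every call */
  local++;
  total_calls++;
  return local;
}
```

Every call returns 1, but @code{total_calls} increases by one on each call.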
|
|
|
+
|
|
|
+@menu
|
|
|
+* Variable Declarations:: Name a variable and reserve space for it.
|
|
|
+* Initializers:: Assigning initial values to variables.
|
|
|
+* Designated Inits:: Assigning initial values to array elements
|
|
|
+ at particular array indices.
|
|
|
+* Auto Type:: Obtaining the type of a variable.
|
|
|
+* Local Variables:: Variables declared in function definitions.
|
|
|
+* File-Scope Variables:: Variables declared outside of
|
|
|
+ function definitions.
|
|
|
+* Static Local Variables:: Variables declared within functions,
|
|
|
+ but with permanent storage allocation.
|
|
|
+* Extern Declarations:: Declaring a variable
|
|
|
+ which is allocated somewhere else.
|
|
|
+* Allocating File-Scope:: When is space allocated
|
|
|
+ for file-scope variables?
|
|
|
+* auto and register:: Historically used storage directions.
|
|
|
+* Omitting Types:: The bad practice of declaring variables
|
|
|
+ with implicit type.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Variable Declarations
|
|
|
+@section Variable Declarations
|
|
|
+@cindex variable declarations
|
|
|
+@cindex declaration of variables
|
|
|
+
|
|
|
+Here's what a variable declaration looks like:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{keywords} @var{basetype} @var{decorated-variable} @r{[}= @var{init}@r{]};
|
|
|
+@end example
|
|
|
+
|
|
|
+The @var{keywords} specify how to handle the scope of the variable
|
|
|
+name and the allocation of its storage. Most declarations have
|
|
|
+no keywords because the defaults are right for them.
|
|
|
+
|
|
|
+C allows these keywords to come before or after @var{basetype}, or
|
|
|
+even in the middle of it as in @code{unsigned static int}, but don't
|
|
|
+do that---it would surprise other programmers. Always write the
|
|
|
+keywords first.
|
|
|
+
|
|
|
+The @var{basetype} can be any of the predefined types of C, or a type
|
|
|
+keyword defined with @code{typedef}. It can also be @code{struct
|
|
|
+@var{tag}}, @code{union @var{tag}}, or @code{enum @var{tag}}. In
|
|
|
+addition, it can include type qualifiers such as @code{const} and
|
|
|
+@code{volatile} (@pxref{Type Qualifiers}).
|
|
|
+
|
|
|
+In the simplest case, @var{decorated-variable} is just the variable
|
|
|
+name. That declares the variable with the type specified by
|
|
|
+@var{basetype}. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+uses @code{int} as the @var{basetype} and @code{foo} as the
|
|
|
+@var{decorated-variable}. It declares @code{foo} with type
|
|
|
+@code{int}.
|
|
|
+
|
|
|
+@example
|
|
|
+struct tree_node foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} with type @code{struct tree_node}.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Declaring Arrays and Pointers:: Declaration syntax for variables of
|
|
|
+ array and pointer types.
|
|
|
+* Combining Variable Declarations:: More than one variable declaration
|
|
|
+ in a single statement.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Declaring Arrays and Pointers
|
|
|
+@subsection Declaring Arrays and Pointers
|
|
|
+@cindex declaring arrays and pointers
|
|
|
+@cindex array, declaring
|
|
|
+@cindex pointers, declaring
|
|
|
+
|
|
|
+To declare a variable that is an array, write
|
|
|
+@code{@var{variable}[@var{length}]} for @var{decorated-variable}:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+To declare a variable that has a pointer type, write
|
|
|
+@code{*@var{variable}} for @var{decorated-variable}:
|
|
|
+
|
|
|
+@example
|
|
|
+struct list_elt *foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+These constructs nest. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[3][5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as an array of 3 arrays of 5 integers each,
|
|
|
+
|
|
|
+@example
|
|
|
+struct list_elt *foo[5];
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as an array of 5 pointers to structures, and
|
|
|
+
|
|
|
+@example
|
|
|
+struct list_elt **foo;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as a pointer to a pointer to a structure.
|
|
|
+
|
|
|
+@example
|
|
|
+int **(*foo[30])(int, double);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as an array of 30 pointers to functions
|
|
|
+(@pxref{Function Pointers}), each of which must accept two arguments
|
|
|
+(one @code{int} and one @code{double}) and return type @code{int **}.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+bar (int size)
|
|
|
+@{
|
|
|
+ int foo[size];
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as an array of integers with a size specified at
|
|
|
+run time when the function @code{bar} is called.
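As a minimal sketch of a variable-length array in use, the following function builds a scratch array whose length comes from its argument. The name @code{sum_of_squares} is made up for illustration, and the sketch assumes @code{size} is positive. The array's storage exists only until the end of the block that declares it.

```c
/* Sum the squares 0*0 + 1*1 + ... + (size-1)*(size-1), using a
   scratch array whose length is chosen at run time.
   Assumes size > 0.  */
int
sum_of_squares (int size)
{
  int squares[size];            /* Variable-length array.  */
  int total = 0;

  for (int i = 0; i < size; i++)
    squares[i] = i * i;
  for (int i = 0; i < size; i++)
    total += squares[i];
  return total;
}
```

Calling @code{sum_of_squares (4)} adds 0, 1, 4 and 9.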
|
|
|
+
|
|
|
+@node Combining Variable Declarations
|
|
|
+@subsection Combining Variable Declarations
|
|
|
+@cindex combining variable declarations
|
|
|
+@cindex variable declarations, combining
|
|
|
+@cindex declarations, combining
|
|
|
+
|
|
|
+When multiple declarations have the same @var{keywords} and
|
|
|
+@var{basetype}, you can combine them using commas. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+@var{keywords} @var{basetype}
|
|
|
+ @var{decorated-variable-1} @r{[}= @var{init1}@r{]},
|
|
|
+ @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+is equivalent to
|
|
|
+
|
|
|
+@example
|
|
|
+@var{keywords} @var{basetype}
|
|
|
+ @var{decorated-variable-1} @r{[}= @var{init1}@r{]};
|
|
|
+@var{keywords} @var{basetype}
|
|
|
+ @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
|
|
|
+@end example
|
|
|
+
|
|
|
+Here are some simple examples:
|
|
|
+
|
|
|
+@example
|
|
|
+int a, b;
|
|
|
+int a = 1, b = 2;
|
|
|
+int a, *p, array[5];
|
|
|
+int a = 0, *p = &a, array[5] = @{1, 2@};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+In the last two examples, @code{a} is an @code{int}, @code{p} is a
|
|
|
+pointer to @code{int}, and @code{array} is an array of 5 @code{int}s.
|
|
|
+Since the initializer for @code{array} specifies only two elements,
|
|
|
+the other three elements are initialized to zero.
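The sketch below repeats the last combined declaration at file scope and adds a second one to show a common trap: the @samp{*} decorates one variable, not the base type. The names @code{q}, @code{r} and @code{store_through_p} are hypothetical, chosen only for this illustration.

```c
int a = 0, *p = &a, array[5] = { 1, 2 };

/* The `*' belongs to the variable, not to the base type:
   q is a pointer to int, but r is a plain int.  */
int *q, r;

/* Store VALUE in a through the pointer p, and return a.  */
int
store_through_p (int value)
{
  *p = value;
  return a;
}
```

To declare two pointers in one declaration, write the @samp{*} before each name, as in @code{int *q, *r;}.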
|
|
|
+
|
|
|
+@node Initializers
|
|
|
+@section Initializers
|
|
|
+@cindex initializers
|
|
|
+
|
|
|
+A variable's declaration, unless it is @code{extern}, should also
|
|
|
+specify its initial value. For numeric and pointer-type variables,
|
|
|
+the initializer is an expression for the value. If necessary, it is
|
|
|
+converted to the variable's type, just as in an assignment.
|
|
|
+
|
|
|
+You can also initialize a local structure-type (@pxref{Structures}) or
|
|
|
+local union-type (@pxref{Unions}) variable this way, from an
|
|
|
+expression whose value has the same type. But you can't initialize an
|
|
|
+array this way (@pxref{Arrays}), since arrays are not first-class
|
|
|
+objects in C (@pxref{Limitations of C Arrays}) and there is no array
|
|
|
+assignment.
|
|
|
+
|
|
|
+You can initialize arrays and structures componentwise,
|
|
|
+with a list of the elements or components. You can initialize
|
|
|
+a union with any one of its alternatives.
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+A component-wise initializer for an array consists of element values
|
|
|
+surrounded by @samp{@{@r{@dots{}}@}}. If the values in the initializer
|
|
|
+don't cover all the elements in the array, the remaining elements are
|
|
|
+initialized to zero.
|
|
|
+
|
|
|
+You can omit the size of the array when you declare it, and let
|
|
|
+the initializer specify the size:
|
|
|
+
|
|
|
+@example
|
|
|
+int array[] = @{ 3, 9, 12 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@item
|
|
|
+A component-wise initializer for a structure consists of field values
|
|
|
+surrounded by @samp{@{@r{@dots{}}@}}. Write the field values in the same
|
|
|
+order as the fields are declared in the structure. If the values in
|
|
|
+the initializer don't cover all the fields in the structure, the
|
|
|
+remaining fields are initialized to zero.
|
|
|
+
|
|
|
+@item
|
|
|
+The initializer for a union-type variable has the form @code{@{
|
|
|
+@var{value} @}}, where @var{value} initializes the @emph{first alternative}
|
|
|
+in the union definition.
|
|
|
+@end itemize
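Here is a small sketch of the structure and union cases described above. The types @code{struct employee} and @code{union number}, and the variables @code{worker} and @code{n}, are invented for the illustration.

```c
struct employee
{
  int id;
  double salary;
  char grade;
};

union number
{
  int i;
  double d;
};

/* Field values in declaration order; grade is not listed,
   so it is initialized to zero.  */
struct employee worker = { 7, 1500.0 };

/* The value initializes the first alternative, i.  */
union number n = { 5 };
```

To initialize a union alternative other than the first, use a designated initializer.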
|
|
|
+
|
|
|
+For an array of arrays, a structure containing arrays, an array of
|
|
|
+structures, etc., you can nest these constructs. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+struct point @{ double x, y; @};
|
|
|
+
|
|
|
+struct point series[]
|
|
|
+ = @{ @{0, 0@}, @{1.5, 2.8@}, @{99, 100.0004@} @};
|
|
|
+@end example
|
|
|
+
|
|
|
+You can omit a pair of inner braces if they contain the right
|
|
|
+number of elements for the sub-value they initialize, so that
|
|
|
+no elements or fields need to be filled in with zeros.
|
|
|
+But don't do that very much, as it gets confusing.
|
|
|
+
|
|
|
+An array of @code{char} can be initialized using a string constant.
|
|
|
+Recall that the string constant includes an implicit null character at
|
|
|
+the end (@pxref{String Constants}). Using a string constant as
|
|
|
+initializer means to use its contents as the initial values of the
|
|
|
+array elements. Here are examples:
|
|
|
+
|
|
|
+@example
|
|
|
+char text[6] = "text!"; /* @r{Includes the null.} */
|
|
|
+char text[5] = "text!"; /* @r{Excludes the null.} */
|
|
|
+char text[] = "text!"; /* @r{Gets length 6.} */
|
|
|
+char text[]
|
|
|
+ = @{ 't', 'e', 'x', 't', '!', 0 @}; /* @r{Same as above.} */
|
|
|
+char text[] = @{ "text!" @}; /* @r{Braces are optional.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and this kind of initializer can be nested inside braces to initialize
|
|
|
+structures or arrays that contain a @code{char}-array.
|
|
|
+
|
|
|
+In like manner, you can use a wide string constant to initialize
|
|
|
+an array of @code{wchar_t}.
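The following sketch checks the sizes that result when the array length is left for the string constant to determine. The variable and function names are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

char text[] = "text!";          /* 5 characters plus the null.  */
wchar_t wide[] = L"text!";      /* 6 elements of type wchar_t.  */

/* Number of elements in each array, counting the null.  */
size_t text_elements = sizeof text / sizeof text[0];
size_t wide_elements = sizeof wide / sizeof wide[0];

/* Length of text as strlen sees it, not counting the null.  */
size_t
text_length (void)
{
  return strlen (text);
}
```

Both arrays have 6 elements, while @code{text_length} reports 5 because @code{strlen} stops at the null character.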
|
|
|
+
|
|
|
+@node Designated Inits
|
|
|
+@section Designated Initializers
|
|
|
+@cindex initializers with labeled elements
|
|
|
+@cindex labeled elements in initializers
|
|
|
+@cindex case labels in initializers
|
|
|
+@cindex designated initializers
|
|
|
+
|
|
|
+In a complex structure or long array, it's useful to indicate
|
|
|
+which field or element we are initializing.
|
|
|
+
|
|
|
+To designate specific array elements during initialization, include
|
|
|
+the array index in brackets, and an assignment operator, for each
|
|
|
+element:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[10] = @{ [3] = 42, [7] = 58 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This does the same thing as:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[10] = @{ 0, 0, 0, 42, 0, 0, 0, 58, 0, 0 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+The array initialization can include non-designated element values
|
|
|
+alongside designated indices; these follow the expected ordering
|
|
|
+of the array initialization, so that
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[10] = @{ [3] = 42, 43, 44, [7] = 58 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+does the same thing as:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[10] = @{ 0, 0, 0, 42, 43, 44, 0, 58, 0, 0 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Note that you can only use constant expressions as array index values,
|
|
|
+not variables.
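Enumeration constants are constant expressions, so they are one convenient way to write the index values. A brief sketch, with a hypothetical @code{enum color} and array @code{weight}:

```c
enum color { RED, GREEN, BLUE, N_COLORS };

/* The designators may appear in any order.  GREEN is not
   mentioned, so its element is initialized to zero.  */
int weight[N_COLORS] = { [BLUE] = 3, [RED] = 1 };
```

Writing the table this way keeps each value next to the name it belongs to, even if the order of the enumeration changes later.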
|
|
|
+
|
|
|
+If you need to initialize a subsequence of sequential array elements to
|
|
|
+the same value, you can specify a range:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[100] = @{ [0 ... 19] = 42, [20 ... 99] = 43 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Using a range this way is a GNU C extension.
|
|
|
+
|
|
|
+When subsequence ranges overlap, each element is initialized by the
|
|
|
+last specification that applies to it. Thus, this initialization is
|
|
|
+equivalent to the previous one.
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[100] = @{ [0 ... 99] = 43, [0 ... 19] = 42 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+as the second overrides the first for elements 0 through 19.
|
|
|
+
|
|
|
+The value used to initialize a range of elements is evaluated only
|
|
|
+once, for the first element in the range. So for example, this code
|
|
|
+
|
|
|
+@example
|
|
|
+int random_values[100]
|
|
|
+ = @{ [0 ... 99] = get_random_number() @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+would initialize all 100 elements of the array @code{random_values} to
|
|
|
+the same value---probably not what is intended.
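When each element should get its own freshly computed value, a loop is the straightforward alternative. This is only a sketch: @code{next_value} is a stand-in that counts upward, used here in place of a real random number source so that the result is predictable.

```c
#define N_VALUES 5

int values[N_VALUES];

/* Stand-in for a function that returns a fresh value on
   each call; it simply counts 1, 2, 3, ...  */
static int
next_value (void)
{
  static int last = 0;
  return ++last;
}

/* Fill values by calling next_value once per element.  */
void
fill_values (void)
{
  for (int i = 0; i < N_VALUES; i++)
    values[i] = next_value ();
}
```

After the first call to @code{fill_values}, the elements hold 1 through 5, one call of @code{next_value} for each.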
|
|
|
+
|
|
|
+Similarly, you can initialize specific fields of a structure variable
|
|
|
+by specifying the field name prefixed with a dot:
|
|
|
+
|
|
|
+@example
|
|
|
+struct point @{ int x; int y; @};
|
|
|
+
|
|
|
+struct point foo = @{ .y = 42 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The same syntax works for union variables as well:
|
|
|
+
|
|
|
+@example
|
|
|
+union int_double @{ int i; double d; @};
|
|
|
+
|
|
|
+union int_double foo = @{ .d = 34 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This converts the integer value 34 to a double and stores it
|
|
|
+in the union variable @code{foo}.
|
|
|
+
|
|
|
+You can designate both array elements and structure elements in
|
|
|
+the same initialization; for example, here's an array of point
|
|
|
+structures:
|
|
|
+
|
|
|
+@example
|
|
|
+struct point point_array[10] = @{ [4].y = 32, [6].y = 39 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Along with the capability to specify particular array and structure
|
|
|
+elements to initialize comes the possibility of initializing the same
|
|
|
+element more than once:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo[10] = @{ [4] = 42, [4] = 98 @};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+In such a case, the last initialization value is retained.
|
|
|
+
|
|
|
+@node Auto Type
|
|
|
+@section Referring to a Type with @code{__auto_type}
|
|
|
+@findex __auto_type
|
|
|
+@findex typeof
|
|
|
+@cindex macros, types of arguments
|
|
|
+
|
|
|
+You can declare a variable copying the type from
|
|
|
+the initializer by using @code{__auto_type} instead of a particular type.
|
|
|
+Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+#define max(a,b) \
|
|
|
+ (@{ __auto_type _a = (a); \
|
|
|
+ __auto_type _b = (b); \
|
|
|
+ _a > _b ? _a : _b; @})
|
|
|
+@end example
|
|
|
+
|
|
|
+This defines @code{_a} to be of the same type as @code{a}, and
|
|
|
+@code{_b} to be of the same type as @code{b}. This is a useful thing
|
|
|
+to do in a macro that ought to be able to handle any type of data
|
|
|
+(@pxref{Macros and Auto Type}).
|
|
|
+
|
|
|
+The original GNU C method for obtaining the type of a value is to use
|
|
|
+@code{typeof}, which takes as an argument either a value or the name of
|
|
|
+a type. The previous example could also be written as:
|
|
|
+
|
|
|
+@example
|
|
|
+#define max(a,b) \
|
|
|
+ (@{ typeof(a) _a = (a); \
|
|
|
+ typeof(b) _b = (b); \
|
|
|
+ _a > _b ? _a : _b; @})
|
|
|
+@end example
|
|
|
+
|
|
|
+@code{typeof} is more flexible than @code{__auto_type}; however, the
|
|
|
+principal use case for @code{typeof} is in variable declarations with
|
|
|
+initialization, which is exactly what @code{__auto_type} handles.
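The point of copying each argument into a local variable is that the argument is evaluated just once. The sketch below makes that visible with an argument that has a side effect. The names @code{MAX_OF}, @code{next_multiple}, @code{evaluations_in_max} and @code{max_double} are invented for this example, and the code relies on GNU C extensions (statement expressions, @code{__auto_type} and @code{__typeof__}).

```c
#define MAX_OF(a,b)                 \
  ({ __auto_type _a = (a);          \
     __auto_type _b = (b);          \
     _a > _b ? _a : _b; })

static int calls;

/* Return 10, 20, 30, ... and count how often it is called.  */
static int
next_multiple (void)
{
  return ++calls * 10;
}

/* Use MAX_OF with an argument that has a side effect, then
   report how many times that argument was evaluated.  */
int
evaluations_in_max (void)
{
  calls = 0;
  int largest = MAX_OF (next_multiple (), 5);
  (void) largest;
  return calls;
}

/* __typeof__ is an alternate spelling of typeof that GCC
   also accepts when compiling in strict ISO C modes.  */
double
max_double (double x, double y)
{
  __typeof__ (x) result = MAX_OF (x, y);
  return result;
}
```

A simpler macro that wrote @code{(a) > (b) ? (a) : (b)} would evaluate one of its arguments twice.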
|
|
|
+
|
|
|
+@node Local Variables
|
|
|
+@section Local Variables
|
|
|
+@cindex local variables
|
|
|
+@cindex variables, local
|
|
|
+
|
|
|
+Declaring a variable inside a function definition (@pxref{Function
|
|
|
+Definitions}) makes the variable name @dfn{local} to the containing
|
|
|
+block---that is, the containing pair of braces. More precisely, the
|
|
|
+variable's name is visible starting just after where it appears in the
|
|
|
+declaration, and its visibility continues until the end of the block.
|
|
|
+
|
|
|
+Local variables in C are generally @dfn{automatic} variables: each
|
|
|
+variable's storage exists only from the declaration to the end of the
|
|
|
+block. Execution of the declaration allocates the storage, computes
|
|
|
+the initial value, and stores it in the variable. The end of the
|
|
|
+block deallocates the storage.@footnote{Due to compiler optimizations,
|
|
|
+allocation and deallocation don't necessarily really happen at
|
|
|
+those times.}
|
|
|
+
|
|
|
+@strong{Warning:} Two declarations for the same local variable
|
|
|
+in the same scope are an error.
|
|
|
+
|
|
|
+@strong{Warning:} Automatic variables are stored in the run-time stack.
|
|
|
+The total space for the program's stack may be limited; therefore,
|
|
|
+in using very large arrays, it may be necessary to allocate
|
|
|
+them in some other way to stop the program from crashing.
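One such other way is to allocate the array dynamically with @code{malloc}. The following is a sketch under that assumption; the function name @code{sum_of_indices} is hypothetical, and the code assumes @code{n} is greater than zero.

```c
#include <stdlib.h>

/* Fill an N-element array with 0, 1, ..., N-1 and return the
   sum, or -1 if the memory cannot be allocated.  The array
   is allocated with malloc rather than on the stack, so N
   can be much larger than the stack would allow.  */
long long
sum_of_indices (size_t n)
{
  long long *table = malloc (n * sizeof *table);
  long long total = 0;

  if (table == NULL)
    return -1;
  for (size_t i = 0; i < n; i++)
    table[i] = (long long) i;
  for (size_t i = 0; i < n; i++)
    total += table[i];
  free (table);
  return total;
}
```

Unlike an automatic variable, memory obtained this way stays allocated until the program calls @code{free}.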
|
|
|
+
|
|
|
+@strong{Warning:} If the declaration of an automatic variable does not
|
|
|
+specify an initial value, the variable starts out containing garbage.
|
|
|
+In this example, the value printed could be anything at all:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+
|
|
|
+ printf ("Print junk %d\n", i);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+In a simple test program, that statement is likely to print 0, simply
|
|
|
+because every process starts with memory zeroed. But don't rely on it
|
|
|
+to be zero---that is erroneous.
|
|
|
+
|
|
|
+@strong{Note:} Make sure to store a value into each local variable (by
|
|
|
+assignment, or by initialization) before referring to its value.
|
|
|
+
|
|
|
+@node File-Scope Variables
|
|
|
+@section File-Scope Variables
|
|
|
+@cindex file-scope variables
|
|
|
+@cindex global variables
|
|
|
+@cindex variables, file-scope
|
|
|
+@cindex variables, global
|
|
|
+
|
|
|
+A variable declaration at the top level in a file (not inside a
|
|
|
+function definition) declares a @dfn{file-scope variable}. Loading a
|
|
|
+program allocates the storage for all the file-scope variables in it,
|
|
|
+and initializes them too.
|
|
|
+
|
|
|
+Each file-scope variable is either @dfn{static} (limited to one
|
|
|
+compilation module) or @dfn{global} (shared with all compilation
|
|
|
+modules in the program). To make the variable static, write the
|
|
|
+keyword @code{static} at the start of the declaration. Omitting
|
|
|
+@code{static} makes the variable global.
|
|
|
+
|
|
|
+The initial value for a file-scope variable can't depend on the
|
|
|
+contents of storage, and can't call any functions.
|
|
|
+
|
|
|
+@example
|
|
|
+int foo = 5; /* @r{Valid.} */
|
|
|
+int bar = foo; /* @r{Invalid!} */
|
|
|
+int bar = sin (1.0); /* @r{Invalid!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+But it can use the address of another file-scope variable:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo;
|
|
|
+int *bar = &foo; /* @r{Valid.} */
|
|
|
+int arr[5];
|
|
|
+int *bar3 = &arr[3]; /* @r{Valid.} */
|
|
|
+int *bar4 = arr + 4; /* @r{Valid.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+It is valid for a module to have multiple declarations for a
|
|
|
+file-scope variable, as long as they are all global or all static, but
|
|
|
+at most one declaration can specify an initial value for it.
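A short sketch that puts these rules together follows; all the names in it are made up for the illustration. It has one global array, one global pointer initialized with an address, and one static counter that other compilation modules cannot see.

```c
int table[5];                 /* Global; no initial value given.  */
int *third = &table[2];       /* Valid: an address is constant.  */
static int hits;              /* Static: private to this module.  */

/* Add one to table[2] through the pointer, and return the
   number of calls so far.  */
int
record_hit (void)
{
  *third += 1;
  return ++hits;
}
```

Another module could declare @code{table} and @code{third} with @code{extern} and use them, but it has no way to name @code{hits}.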
|
|
|
+
|
|
|
+@node Static Local Variables
|
|
|
+@section Static Local Variables
|
|
|
+@cindex static local variables
|
|
|
+@cindex variables, static local
|
|
|
+@findex static
|
|
|
+
|
|
|
+The keyword @code{static} in a local variable declaration says to
|
|
|
+allocate the storage for the variable permanently, just like a
|
|
|
+file-scope variable, even if the declaration is within a function.
|
|
|
+
|
|
|
+Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+increment_counter ()
|
|
|
+@{
|
|
|
+ static int counter = 0;
|
|
|
+ return ++counter;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The scope of the name @code{counter} runs from the declaration to the
|
|
|
+end of the containing block, just like an automatic local variable,
|
|
|
+but its storage is permanent, so the value persists from one call to
|
|
|
+the next. As a result, each call to @code{increment_counter}
|
|
|
+returns a different, unique value.
|
|
|
+
|
|
|
+The initial value of a static local variable has the same limitations
|
|
|
+as for file-scope variables: it can't depend on the contents of
|
|
|
+storage or call any functions. It can use the address of a file-scope
|
|
|
+variable or a static local variable, because those addresses are
|
|
|
+determined before the program runs.
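To see the difference that @code{static} makes, compare these two functions, which differ only in that keyword. The function names are hypothetical.

```c
/* The static variable keeps its value between calls.  */
int
static_count (void)
{
  static int counter = 0;
  return ++counter;
}

/* The automatic variable is created and initialized afresh
   on every call, so this always returns 1.  */
int
automatic_count (void)
{
  int counter = 0;
  return ++counter;
}
```

The initializer of the static variable takes effect once, before the program runs; the initializer of the automatic variable is executed each time control reaches the declaration.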
|
|
|
+
|
|
|
+@node Extern Declarations
|
|
|
+@section @code{extern} Declarations
|
|
|
+@cindex @code{extern} declarations
|
|
|
+@cindex declarations, @code{extern}
|
|
|
+@findex extern
|
|
|
+
|
|
|
+An @code{extern} declaration is used to refer to a global variable
|
|
|
+whose principal declaration comes elsewhere---in the same module, or in
|
|
|
+another compilation module. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+extern @var{basetype} @var{decorated-variable};
|
|
|
+@end example
|
|
|
+
|
|
|
+Its meaning is that, in the current scope, the variable name refers to
|
|
|
+the file-scope variable of that name---which needs to be declared in a
|
|
|
+non-@code{extern}, non-@code{static} way somewhere else.
|
|
|
+
|
|
|
+For instance, if one compilation module has this global variable
|
|
|
+declaration
|
|
|
+
|
|
|
+@example
|
|
|
+int error_count = 0;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+then other compilation modules can specify this
|
|
|
+
|
|
|
+@example
|
|
|
+extern int error_count;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+to allow reference to the same variable.
|
|
|
+
|
|
|
+The usual place to write an @code{extern} declaration is at top level
|
|
|
+in a source file, but you can write an @code{extern} declaration
|
|
|
+inside a block to make a global or static file-scope variable
|
|
|
+accessible in that block.
|
|
|
+
|
|
|
+Since an @code{extern} declaration does not allocate space for the
|
|
|
+variable, it can omit the size of an array:
|
|
|
+
|
|
|
+@example
|
|
|
+extern int array[];
|
|
|
+@end example
|
|
|
+
|
|
|
+You can use @code{array} normally in all contexts where it is
|
|
|
+converted automatically to a pointer. However, to use it as the
|
|
|
+operand of @code{sizeof} is an error, since the size is unknown.
|
|
|
+
|
|
|
+It is valid to have multiple @code{extern} declarations for the same
|
|
|
+variable, even in the same scope, if they give the same type. They do
|
|
|
+not conflict---they agree. For an array, it is legitimate for some
|
|
|
+@code{extern} declarations to specify the size while others omit it.
|
|
|
+However, if two declarations give different sizes, that is an error.
|
|
|
+
|
|
|
+Likewise, you can use @code{extern} declarations at file scope
|
|
|
+(@pxref{File-Scope Variables}) followed by an ordinary global
|
|
|
+(non-static) declaration of the same variable. They do not conflict,
|
|
|
+because they say compatible things about the same meaning of the variable.
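The sketch below shows these cases in a single file: an @code{extern} declaration at file scope, a use of the variable before its definition, the definition itself, and an @code{extern} declaration inside a block. The function names are invented for the example.

```c
extern int error_count;       /* Declaration only; no space.  */

/* Uses the variable before its definition appears.  */
int
report_error (void)
{
  return ++error_count;
}

int error_count = 0;          /* The definition; allocates space.  */

int
errors_so_far (void)
{
  extern int error_count;     /* Valid inside a block, too.  */
  return error_count;
}
```

All three declarations refer to the same variable, so a count made by @code{report_error} is visible in @code{errors_so_far}.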
|
|
|
+
|
|
|
+@node Allocating File-Scope
|
|
|
+@section Allocating File-Scope Variables
|
|
|
+@cindex allocating file-scope variables
|
|
|
+@cindex file-scope variables, allocating
|
|
|
+
|
|
|
+Some file-scope declarations allocate space for the variable, and some
|
|
|
+don't.
|
|
|
+
|
|
|
+A file-scope declaration with an initial value @emph{must} allocate
|
|
|
+space for the variable; if there are two of such declarations for the
|
|
|
+same variable, even in different compilation modules, they conflict.
|
|
|
+
|
|
|
+An @code{extern} declaration @emph{never} allocates space for the variable.
|
|
|
+If all the top-level declarations of a certain variable are
|
|
|
+@code{extern}, the variable never gets memory space. If that variable
|
|
|
+is used anywhere in the program, the use will be reported as an error,
|
|
|
+saying that the variable is not defined.
|
|
|
+
|
|
|
+@cindex tentative definition
|
|
|
+A file-scope declaration without an initial value is called a
|
|
|
+@dfn{tentative definition}. This is a strange hybrid: it @emph{can}
|
|
|
+allocate space for the variable, but does not insist. So it causes no
|
|
|
+conflict, no error, if the variable has another declaration that
|
|
|
+allocates space for it, perhaps in another compilation module. But if
|
|
|
+nothing else allocates space for the variable, the tentative
|
|
|
+definition will do it. Any number of compilation modules can declare
|
|
|
+the same variable in this way, and that is sufficient for all of them
|
|
|
+to use the variable.
|
|
|
+
|
|
|
+@c @opindex -fno-common
|
|
|
+@c @opindex --warn_common
|
|
|
+In programs that are very large or have many contributors, it may be
|
|
|
+wise to adopt the convention of never using tentative definitions.
|
|
|
+You can use the compilation option @option{-fno-common} to make them
|
|
|
+an error, or @option{--warn-common} to warn about them.
|
|
|
+
|
|
|
+If a file-scope variable gets its space through a tentative
|
|
|
+definition, it starts out containing all zeros.
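Here is a minimal sketch of tentative definitions within one compilation module. The variable and function names are hypothetical. Neither variable has an initializer anywhere, so both get their space from the tentative definitions.

```c
int pending;                  /* Tentative definition.  */
int pending;                  /* Another one; no conflict.  */
int slots[4];                 /* Tentative, too.  */

/* Add up pending and every element of slots.  */
int
total_of_tentatives (void)
{
  int total = pending;

  for (int i = 0; i < 4; i++)
    total += slots[i];
  return total;
}
```

Since nothing stores into these variables, the function returns zero.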
|
|
|
+
|
|
|
+@node auto and register
|
|
|
+@section @code{auto} and @code{register}
|
|
|
+@cindex @code{auto} declarations
|
|
|
+@cindex @code{register} declarations
|
|
|
+@findex auto
|
|
|
+@findex register
|
|
|
+
|
|
|
+For historical reasons, you can write @code{auto} or @code{register}
|
|
|
+before a local variable declaration. @code{auto} merely emphasizes
|
|
|
+that the variable isn't static; it changes nothing.
|
|
|
+
|
|
|
+@code{register} suggests that the compiler store this variable in a
|
|
|
+register. However, GNU C ignores this suggestion, since it can
|
|
|
+choose the best variables to store in registers without any hints.
|
|
|
+
|
|
|
+It is an error to take the address of a variable declared
|
|
|
+@code{register}, so you cannot use the unary @samp{&} operator on it.
|
|
|
+If the variable is an array, you can't use it at all (other than as
|
|
|
+the operand of @code{sizeof}), which makes it rather useless.
|
|
|
+
|
|
|
+@node Omitting Types
|
|
|
+@section Omitting Types in Declarations
|
|
|
+@cindex omitting types in declarations
|
|
|
+
|
|
|
+The syntax of C traditionally allows omitting the data type in a
|
|
|
+declaration if it specifies a storage class, a type qualifier (see the
|
|
|
+next chapter), or @code{auto} or @code{register}. Then the type
|
|
|
+defaults to @code{int}. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+auto foo = 42;
|
|
|
+@end example
|
|
|
+
|
|
|
+This is bad practice; if you see it, fix it.
|
|
|
+
|
|
|
+@node Type Qualifiers
|
|
|
+@chapter Type Qualifiers
|
|
|
+
|
|
|
+A declaration can include type qualifiers to advise the compiler
|
|
|
+about how the variable will be used. There are three different
|
|
|
+qualifiers, @code{const}, @code{volatile} and @code{restrict}. They
|
|
|
+pertain to different issues, so you can use more than one together.
|
|
|
+For instance, @code{const volatile} describes a value that the
|
|
|
+program is not allowed to change, but might have a different value
|
|
|
+each time the program examines it. (This might perhaps be a special
|
|
|
+hardware register, or part of shared memory.)
|
|
|
+
|
|
|
+If you are just learning C, you can skip this chapter.
|
|
|
+
|
|
|
+@menu
|
|
|
+* const:: Variables whose values don't change.
|
|
|
+* volatile:: Variables whose values may be accessed
|
|
|
+ or changed outside of the control of
|
|
|
+ this program.
|
|
|
+* restrict Pointers:: Restricted pointers for code optimization.
|
|
|
+* restrict Pointer Example:: Example of how that works.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node const
|
|
|
+@section @code{const} Variables and Fields
|
|
|
+@cindex @code{const} variables and fields
|
|
|
+@cindex variables, @code{const}
|
|
|
+@findex const
|
|
|
+
|
|
|
+You can mark a variable as ``constant'' by writing @code{const} in
|
|
|
+front of the declaration. This says to treat any assignment to that
|
|
|
+variable as an error. It may also permit some compiler
|
|
|
+optimizations---for instance, to fetch the value only once to satisfy
|
|
|
+multiple references to it. The construct looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+const double pi = 3.14159;
|
|
|
+@end example
|
|
|
+
|
|
|
+After this definition, the code can use the variable @code{pi}
|
|
|
+but cannot assign a different value to it.
|
|
|
+
|
|
|
+@example
|
|
|
+pi = 3.0; /* @r{Error!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+Simple variables that are constant can be used for the same purposes
|
|
|
+as enumeration constants, and they are not limited to integers. The
|
|
|
+constantness of the variable propagates into pointers, too.
|
|
|
+
|
|
|
+A pointer type can specify that the @emph{target} is constant. For
|
|
|
+example, the pointer type @code{const double *} stands for a pointer
|
|
|
+to a constant @code{double}. That's the type that results from taking
|
|
|
+the address of @code{pi}. Such a pointer can't be dereferenced in the
|
|
|
+left side of an assignment.
|
|
|
+
|
|
|
+@example
|
|
|
+*(&pi) = 3.0; /* @r{Error!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+Nonconstant pointers can be converted automatically to constant
|
|
|
+pointers, but not vice versa. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+const double *cptr;
|
|
|
+double *ptr;
|
|
|
+
|
|
|
+cptr = &pi; /* @r{Valid.} */
|
|
|
+cptr = ptr; /* @r{Valid.} */
|
|
|
+ptr = cptr; /* @r{Error!} */
|
|
|
+ptr = &pi; /* @r{Error!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+This is not an ironclad protection against modifying the value. You
|
|
|
+can always cast the constant pointer to a nonconstant pointer type:
|
|
|
+
|
|
|
+@example
|
|
|
+ptr = (double *)cptr; /* @r{Valid.} */
|
|
|
+ptr = (double *)&pi; /* @r{Valid.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+However, @code{const} provides a way to show that a certain function
|
|
|
+won't modify the data structure whose address is passed to it. Here's
|
|
|
+an example:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+string_length (const char *string)
|
|
|
+@{
|
|
|
+ int count = 0;
|
|
|
+ while (*string++)
|
|
|
+ count++;
|
|
|
+ return count;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Using @code{const char *} for the parameter is a way of saying this
|
|
|
+function never modifies the memory of the string itself.
|
|
|
+
|
|
|
+In calling @code{string_length}, you can specify an ordinary
|
|
|
+@code{char *} since that can be converted automatically to @code{const
|
|
|
+char *}.
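As a further sketch of the same idea, here is a function that inspects a string through a @code{const char *} parameter, together with a caller that passes an ordinary modifiable array. The names @code{count_char}, @code{fruit} and @code{count_a_in_fruit} are made up for the illustration.

```c
/* Count how many times C occurs in STRING.  The const in the
   parameter type promises not to modify the characters.  */
int
count_char (const char *string, char c)
{
  int count = 0;

  for (; *string; string++)
    if (*string == c)
      count++;
  return count;
}

/* An ordinary, modifiable array; passing it converts
   char * to const char * automatically.  */
char fruit[] = "banana";

int
count_a_in_fruit (void)
{
  return count_char (fruit, 'a');
}
```

Note that the pointer @code{string} itself is not constant, so the function may advance it; only the characters it points to are protected.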
|
|
|
+
|
|
|
+@node volatile
|
|
|
+@section @code{volatile} Variables and Fields
|
|
|
+@cindex @code{volatile} variables and fields
|
|
|
+@cindex variables, @code{volatile}
|
|
|
+@findex volatile
|
|
|
+
|
|
|
+The GNU C compiler often performs optimizations that eliminate the
|
|
|
+need to write or read a variable. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int foo;
|
|
|
+foo = 1;
|
|
|
+foo++;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+might simply store the value 2 into @code{foo}, without ever storing 1.
|
|
|
+These optimizations can also apply to structure fields in some cases.
|
|
|
+
|
|
|
+If the memory containing @code{foo} is shared with another program,
|
|
|
+or if it is examined asynchronously by hardware, such optimizations
|
|
|
+could confuse the communication. Using @code{volatile} is one way
|
|
|
+to prevent them.
|
|
|
+
|
|
|
+Writing @code{volatile} with the type in a variable or field declaration
|
|
|
+says that the value may be examined or changed for reasons outside the
|
|
|
+control of the program at any moment. Therefore, the program must
|
|
|
+execute in a careful way to assure correct interaction with those
|
|
|
+accesses, whenever they may occur.
|
|
|
+
|
|
|
+The simplest use looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+volatile int lock;
|
|
|
+@end example
|
|
|
+
|
|
|
+This directs the compiler not to do certain common optimizations on
|
|
|
+use of the variable @code{lock}. All the reads and writes for a volatile
|
|
|
+variable or field are really done, and done in the order specified
|
|
|
+by the source code. Thus, this code:
|
|
|
+
|
|
|
+@example
|
|
|
+lock = 1;
|
|
|
+list = list->next;
|
|
|
+if (lock)
|
|
|
+ lock_broken (&lock);
|
|
|
+lock = 0;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+really stores the value 1 in @code{lock}, even though there is no
|
|
|
+sign it is really used, and the @code{if} statement reads and
|
|
|
+checks the value of @code{lock}, rather than assuming it is still 1.
|
|
|
+
|
|
|
+A limited amount of optimization can be done, in principle, on
|
|
|
+@code{volatile} variables and fields: multiple references between two
|
|
|
+sequence points (@pxref{Sequence Points}) can be simplified together.
|
|
|
+
|
|
|
+Use of @code{volatile} does not eliminate the flexibility in ordering
|
|
|
+the computation of the operands of most operators. For instance, in
|
|
|
+@code{lock + foo ()}, the order of accessing @code{lock} and calling
|
|
|
+@code{foo} is not specified, so they may be done in either order; the
|
|
|
+fact that @code{lock} is @code{volatile} has no effect on that.
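A signal handler is another case in which a variable can change outside the normal flow of the code that reads it. The sketch below is an illustration under that assumption, not taken from the text above; the names @code{got_signal}, @code{note_signal} and @code{wait_for_signal} are hypothetical. It uses the standard type @code{sig_atomic_t} for the flag.

```c
#include <signal.h>

/* Set by the handler, read by the main code.  */
static volatile sig_atomic_t got_signal;

static void
note_signal (int sig)
{
  (void) sig;
  got_signal = 1;
}

/* Install the handler, send this process SIGINT, and wait
   until the handler has run.  Because got_signal is
   volatile, the loop really reads it each time around.  */
int
wait_for_signal (void)
{
  got_signal = 0;
  signal (SIGINT, note_signal);
  raise (SIGINT);
  while (!got_signal)
    ;
  signal (SIGINT, SIG_DFL);
  return got_signal;
}
```

Without @code{volatile}, the compiler would be entitled to read @code{got_signal} once and assume it never changes inside the loop.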
|
|
|
+
|
|
|
+@node restrict Pointers
|
|
|
+@section @code{restrict}-Qualified Pointers
|
|
|
+@cindex @code{restrict} pointers
|
|
|
+@cindex pointers, @code{restrict}-qualified
|
|
|
+@findex restrict
|
|
|
+
|
|
|
+You can declare a pointer as ``restricted'' using the @code{restrict}
|
|
|
+type qualifier, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+int *restrict p = x;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This enables better optimization of code that uses the pointer.
|
|
|
+
|
|
|
+If @code{p} is declared with @code{restrict}, and then the code
|
|
|
+references the object that @code{p} points to (using @code{*p} or
|
|
|
+@code{p[@var{i}]}), the @code{restrict} declaration promises that the
|
|
|
+code will not access that object in any other way---only through
|
|
|
+@code{p}.
|
|
|
+
|
|
|
+For instance, it means the code must not use another pointer
|
|
|
+to access the same space, as shown here:
|
|
|
+
|
|
|
+@example
|
|
|
+int *restrict p = @var{whatever};
|
|
|
+int *q = p;
|
|
|
+foo (*p, *q);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+That contradicts the @code{restrict} promise by accessing the object
|
|
|
+that @code{p} points to using @code{q}, which bypasses @code{p}.
|
|
|
+Likewise, it must not do this:
|
|
|
+
|
|
|
+@example
|
|
|
+int *restrict p = @var{whatever};
|
|
|
+struct @{ int *a, *b; @} s;
|
|
|
+s.a = p;
|
|
|
+foo (*p, *s.a);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This example uses a structure field instead of the variable @code{q}
|
|
|
+to hold the other pointer, and that contradicts the promise just the
|
|
|
+same.
|
|
|
+
|
|
|
+The keyword @code{restrict} also promises that @code{p} won't point to
|
|
|
+the allocated space of any automatic or static variable. So the code
|
|
|
+must not do this:
|
|
|
+
|
|
|
+@example
|
|
|
+int a;
|
|
|
+int *restrict p = &a;
|
|
|
+foo (*p, a);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+because that does direct access to the object (@code{a}) that @code{p}
|
|
|
+points to, which bypasses @code{p}.
|
|
|
+
|
|
|
+If the code makes such promises with @code{restrict} then breaks them,
|
|
|
+execution is unpredictable.
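By contrast, here is a minimal sketch of code that keeps the promise.
The function name @code{scale_into} and its parameters are
hypothetical, chosen only for illustration.  It assumes the caller
passes two arrays that do not overlap, so that each object is accessed
only through its own @code{restrict} pointer.

```c
#include <stddef.h>

/* Hypothetical helper: store FACTOR times each element of SRC
   into DST.  The restrict qualifiers promise that the two
   arrays do not overlap.  */
void
scale_into (int *restrict dst, const int *restrict src,
            size_t n, int factor)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] * factor;
}
```

Calling @code{scale_into (out, in, 3, 2)} with two separate arrays
keeps the promise; calling @code{scale_into (a, a, 3, 2)} would break
it.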
|
|
|
+
|
|
|
+@node restrict Pointer Example
|
|
|
+@section @code{restrict} Pointer Example
|
|
|
+
|
|
|
+Here are examples where @code{restrict} enables real optimization.
|
|
|
+
|
|
|
+In this example, @code{restrict} assures GCC that the array @code{out}
|
|
|
+points to does not overlap with the array @code{in} points to.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+process_data (const char *in,
|
|
|
+ char * restrict out,
|
|
|
+ size_t size)
|
|
|
+@{
|
|
|
+  for (size_t i = 0; i < size; i++)
|
|
|
+ out[i] = in[i] + in[i + 1];
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Here's a simple tree structure, where each tree node holds data of
|
|
|
+type @code{PAYLOAD} plus two subtrees.
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo
|
|
|
+ @{
|
|
|
+ PAYLOAD payload;
|
|
|
+ struct foo *left;
|
|
|
+ struct foo *right;
|
|
|
+ @};
|
|
|
+@end example
|
|
|
+
|
|
|
+Now here's a function to null out both pointers in the @code{left}
|
|
|
+subtree.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+null_left (struct foo *a)
|
|
|
+@{
|
|
|
+ a->left->left = NULL;
|
|
|
+ a->left->right = NULL;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Since @code{*a} and @code{*a->left} have the same data type,
|
|
|
+they could legitimately alias (@pxref{Aliasing}). Therefore,
|
|
|
+the compiled code for @code{null_left} must read @code{a->left}
|
|
|
+again from memory when executing the second assignment statement.
|
|
|
+
|
|
|
+We can enable optimization, so that it does not need to read
|
|
|
+@code{a->left} again, by writing @code{null_left} in a less
|
|
|
+obvious way.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+null_left (struct foo *a)
|
|
|
+@{
|
|
|
+ struct foo *b = a->left;
|
|
|
+ b->left = NULL;
|
|
|
+ b->right = NULL;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+A more elegant way to fix this is with @code{restrict}.
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+null_left (struct foo *restrict a)
|
|
|
+@{
|
|
|
+ a->left->left = NULL;
|
|
|
+ a->left->right = NULL;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Declaring @code{a} as @code{restrict} asserts that other pointers such
|
|
|
+as @code{a->left} will not point to the same memory space as @code{a}.
|
|
|
+Therefore, the memory location @code{a->left->left} cannot be the same
|
|
|
+memory as @code{a->left}. Knowing this, the compiled code may avoid
|
|
|
+reloading @code{a->left} for the second statement.
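To try the @code{restrict} version, it may help to see it in a
complete, compilable form.  The following is only a sketch: it assumes
@code{PAYLOAD} is @code{int}, and the helper @code{null_left_demo} is
hypothetical, added here just to build a small tree and check the
result.

```c
#include <stdlib.h>

typedef int PAYLOAD;   /* Assumption: any payload type will do.  */

struct foo
  {
    PAYLOAD payload;
    struct foo *left;
    struct foo *right;
  };

void
null_left (struct foo *restrict a)
{
  a->left->left = NULL;
  a->left->right = NULL;
}

/* Hypothetical helper: build a three-node tree, call null_left
   on the root, and return 1 if both pointers in the left
   subtree are now null.  */
int
null_left_demo (void)
{
  struct foo *root = malloc (sizeof *root);
  struct foo *mid = malloc (sizeof *mid);
  struct foo *leaf = malloc (sizeof *leaf);
  int ok = 0;

  if (root && mid && leaf)
    {
      root->left = mid;      /* The root is a different node from mid.  */
      mid->left = leaf;
      mid->right = leaf;
      null_left (root);
      ok = (mid->left == NULL && mid->right == NULL);
    }
  free (leaf);
  free (mid);
  free (root);
  return ok;
}
```

The root node and its left child are separate objects here, which is
what the @code{restrict} declaration of @code{a} requires.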
|
|
|
+
|
|
|
+@node Functions
|
|
|
+@chapter Functions
|
|
|
+@cindex functions
|
|
|
+
|
|
|
+We have already presented many examples of functions, so if you've
|
|
|
+read this far, you basically understand the concept of a function. It
|
|
|
+is vital, nonetheless, to have a chapter in the manual that collects
|
|
|
+all the information about functions.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Function Definitions:: Writing the body of a function.
|
|
|
+* Function Declarations:: Declaring the interface of a function.
|
|
|
+* Function Calls:: Using functions.
|
|
|
+* Function Call Semantics:: Call-by-value argument passing.
|
|
|
+* Function Pointers:: Using references to functions.
|
|
|
+* The main Function:: Where execution of a GNU C program begins.
|
|
|
+* Advanced Definitions:: Advanced features of function definitions.
|
|
|
+* Obsolete Definitions:: Obsolete features still used
|
|
|
+ in function definitions in old code.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Function Definitions
|
|
|
+@section Function Definitions
|
|
|
+@cindex function definitions
|
|
|
+@cindex defining functions
|
|
|
+
|
|
|
+We have already presented many examples of function definitions. To
|
|
|
+summarize the rules, a function definition looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{returntype}
|
|
|
+@var{functionname} (@var{parm_declarations}@r{@dots{}})
|
|
|
+@{
|
|
|
+ @var{body}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The part before the open-brace is called the @dfn{function header}.
|
|
|
+
|
|
|
+Write @code{void} as the @var{returntype} if the function does
|
|
|
+not return a value.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Function Parameter Variables:: Syntax and semantics
|
|
|
+ of function parameters.
|
|
|
+* Forward Function Declarations:: Functions can only be called after
|
|
|
+ they have been defined or declared.
|
|
|
+* Static Functions:: Limiting visibility of a function.
|
|
|
+* Arrays as Parameters:: Functions that accept array arguments.
|
|
|
+* Structs as Parameters:: Functions that accept structure arguments.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Function Parameter Variables
|
|
|
+@subsection Function Parameter Variables
|
|
|
+@cindex function parameter variables
|
|
|
+@cindex parameter variables in functions
|
|
|
+@cindex parameter list
|
|
|
+
|
|
|
+A function parameter variable is a local variable (@pxref{Local
|
|
|
+Variables}) used within the function to store the value passed as an
|
|
|
+argument in a call to the function. Usually we say ``function
|
|
|
+parameter'' or ``parameter'' for short, not mentioning the fact that
|
|
|
+it's a variable.
|
|
|
+
|
|
|
+We declare these variables in the beginning of the function
|
|
|
+definition, in the @dfn{parameter list}. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+fib (int n)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+has a parameter list with one function parameter @code{n}, which has
|
|
|
+type @code{int}.
|
|
|
+
|
|
|
+Function parameter declarations differ from ordinary variable
|
|
|
+declarations in several ways:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+Inside the function definition header, commas separate parameter
|
|
|
+declarations, and each parameter needs a complete declaration
|
|
|
+including the type. For instance, if a function @code{foo} has two
|
|
|
+@code{int} parameters, write this:
|
|
|
+
|
|
|
+@example
|
|
|
+foo (int a, int b)
|
|
|
+@end example
|
|
|
+
|
|
|
+You can't share the common @code{int} between the two declarations:
|
|
|
+
|
|
|
+@example
|
|
|
+foo (int a, b) /* @r{Invalid!} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@item
|
|
|
+A function parameter variable is initialized to whatever value is
|
|
|
+passed in the function call, so its declaration cannot specify an
|
|
|
+initial value.
|
|
|
+
|
|
|
+@item
|
|
|
+Writing an array type in a function parameter declaration has the
|
|
|
+effect of declaring it as a pointer. The size specified for the array
|
|
|
+has no effect at all, and we normally omit the size. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+foo (int a[5])
|
|
|
+foo (int a[])
|
|
|
+foo (int *a)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+are equivalent.
|
|
|
+
|
|
|
+@item
|
|
|
+The scope of the parameter variables is the entire function body,
|
|
|
+notwithstanding the fact that they are written in the function header,
|
|
|
+which is just outside the function body.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+If a function has no parameters, it would be most natural for the
|
|
|
+list of parameters in its definition to be empty. But that, in C, has
|
|
|
+a special meaning for historical reasons: ``Do not check that calls to
|
|
|
+this function have the right number of arguments.'' Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+foo ()
|
|
|
+@{
|
|
|
+ return 5;
|
|
|
+@}
|
|
|
+
|
|
|
+int
|
|
|
+bar (int x)
|
|
|
+@{
|
|
|
+ return foo (x);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+would not report a compilation error in passing @code{x} as an
|
|
|
+argument to @code{foo}. By contrast,
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ return 5;
|
|
|
+@}
|
|
|
+
|
|
|
+int
|
|
|
+bar (int x)
|
|
|
+@{
|
|
|
+ return foo (x);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+would report an error because @code{foo} is supposed to receive
|
|
|
+no arguments.
|
|
|
+
|
|
|
+@node Forward Function Declarations
|
|
|
+@subsection Forward Function Declarations
|
|
|
+@cindex forward function declarations
|
|
|
+@cindex function declarations, forward
|
|
|
+
|
|
|
+The order of the function definitions in the source code makes no
|
|
|
+difference, except that each function needs to be defined or declared
|
|
|
+before code uses it.
|
|
|
+
|
|
|
+The definition of a function also declares its name for the rest of
|
|
|
+the containing scope. But what if you want to call the function
|
|
|
+before its definition? To permit that, write a compatible declaration
|
|
|
+of the same function, before the first call. A declaration that
|
|
|
+prefigures a subsequent definition in this way is called a
|
|
|
+@dfn{forward declaration}. The function declaration can be at top
|
|
|
+@c ??? file scope
|
|
|
+level or within a block, and it applies until the end of the containing
|
|
|
+scope.
|
|
|
+
|
|
|
+@xref{Function Declarations}, for more information about these
|
|
|
+declarations.
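Forward declarations are unavoidable when two functions call each
other.  Here is a minimal sketch; the names @code{is_even} and
@code{is_odd} are hypothetical, chosen for illustration.

```c
/* Forward declaration: lets is_even call is_odd before the
   definition of is_odd appears.  */
int is_odd (unsigned int n);

int
is_even (unsigned int n)
{
  return n == 0 ? 1 : is_odd (n - 1);
}

int
is_odd (unsigned int n)
{
  return n == 0 ? 0 : is_even (n - 1);
}
```

Without the first line, the call to @code{is_odd} inside
@code{is_even} would refer to a function not yet declared.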
|
|
|
+
|
|
|
+@node Static Functions
|
|
|
+@subsection Static Functions
|
|
|
+@cindex static functions
|
|
|
+@cindex functions, static
|
|
|
+@findex static
|
|
|
+
|
|
|
+The keyword @code{static} in a function definition limits the
|
|
|
+visibility of the name to the current compilation module. (That's the
|
|
|
+same thing @code{static} does in variable declarations;
|
|
|
+@pxref{File-Scope Variables}.) For instance, if one compilation module
|
|
|
+contains this code:
|
|
|
+
|
|
|
+@example
|
|
|
+static int
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+then the code of that compilation module can call @code{foo} anywhere
|
|
|
+after the definition, but other compilation modules cannot refer to it
|
|
|
+at all.
|
|
|
+
|
|
|
+@cindex forward declaration
|
|
|
+@cindex static function, declaration
|
|
|
+To call @code{foo} before its definition, it needs a forward
|
|
|
+declaration, which should use @code{static} since the function
|
|
|
+definition does. For this function, it looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+static int foo (void);
|
|
|
+@end example
|
|
|
+
|
|
|
+It is generally wise to use @code{static} on the definitions of
|
|
|
+functions that won't be called from outside the same compilation
|
|
|
+module. This makes sure that calls are not added in other modules.
|
|
|
+If programmers decide to change the function's calling convention, or
|
|
|
+need to understand all the consequences of its use, they will only have to
|
|
|
+check for calls in the same compilation module.
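As a sketch of how this looks in practice, the following compilation
module exports one function and keeps its helper private.  The names
@code{average_percent} and @code{clamp_percent} are hypothetical.

```c
/* Forward declaration; it uses static because the definition does.  */
static int clamp_percent (int value);

/* Visible to other compilation modules.  */
int
average_percent (int a, int b)
{
  return (clamp_percent (a) + clamp_percent (b)) / 2;
}

/* Visible only within this compilation module.  */
static int
clamp_percent (int value)
{
  if (value < 0)
    return 0;
  if (value > 100)
    return 100;
  return value;
}
```

Other modules can call @code{average_percent}, but any change to
@code{clamp_percent} requires checking only this one file.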
|
|
|
+
|
|
|
+@node Arrays as Parameters
|
|
|
+@subsection Arrays as Parameters
|
|
|
+@cindex arrays as parameters
|
|
|
+@cindex functions with array parameters
|
|
|
+
|
|
|
+Arrays in C are not first-class objects: it is impossible to copy
|
|
|
+them. So they cannot be passed as arguments like other values.
|
|
|
+@xref{Limitations of C Arrays}. Rather, array parameters work in
|
|
|
+a special way.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Array Parm Pointer::
|
|
|
+* Passing Array Args::
|
|
|
+* Array Parm Qualifiers::
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Array Parm Pointer
|
|
|
+@subsubsection Array parameters are pointers
|
|
|
+
|
|
|
+Declaring a function parameter variable as an array really gives it a
|
|
|
+pointer type. C does this because an expression with array type, if
|
|
|
+used as an argument in a function call, is converted automatically to
|
|
|
+a pointer (to the zeroth element of the array). If you declare the
|
|
|
+corresponding parameter as an ``array'', it will work correctly with
|
|
|
+the pointer value that really gets passed.
|
|
|
+
|
|
|
+This relates to the fact that C does not check array bounds in access
|
|
|
+to elements of the array (@pxref{Accessing Array Elements}).
|
|
|
+
|
|
|
+For example, in this function,
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (int array[20])
|
|
|
+@{
|
|
|
+ array[4] = 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+the parameter @code{array}'s real type is @code{int *}; the specified
|
|
|
+length, 20, has no effect on the program. You can leave out the length
|
|
|
+and write this:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (int array[])
|
|
|
+@{
|
|
|
+ array[4] = 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or write the parameter declaration explicitly as a pointer:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (int *array)
|
|
|
+@{
|
|
|
+ array[4] = 0;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+They are all equivalent.
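One way to convince yourself of the equivalence is to store the
function's address in a function pointer whose parameter type is
written as @code{int *}.  This is only a sketch; the variable name
@code{clobber_ptr} is hypothetical.

```c
void
clobber4 (int array[20])
{
  array[4] = 0;
}

/* This initialization needs no cast, because the parameter
   type of clobber4 is really int *.  */
void (*clobber_ptr) (int *) = clobber4;
```

The compiler accepts the initialization because the two types are the
same.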
|
|
|
+
|
|
|
+@node Passing Array Args
|
|
|
+@subsubsection Passing array arguments
|
|
|
+
|
|
|
+A function call passes the pointer to the array's zeroth element by
|
|
|
+value, like all argument values in C@. However, the result is
|
|
|
+paradoxical in that the array itself is passed by reference: its
|
|
|
+contents are treated as shared memory---shared between the caller and
|
|
|
+the called function, that is. When @code{clobber4} assigns to element
|
|
|
+4 of @code{array}, the effect is to alter element 4 of the array
|
|
|
+specified in the call.
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h>   /* @r{Declares @code{printf}.} */
|
|
|
+#include <stdlib.h> /* @r{Declares @code{malloc},} */
|
|
|
+ /* @r{Defines @code{EXIT_SUCCESS}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ int data[] = @{1, 2, 3, 4, 5, 6@};
|
|
|
+ int i;
|
|
|
+
|
|
|
+ /* @r{Show the initial value of element 4.} */
|
|
|
+ for (i = 0; i < 6; i++)
|
|
|
+ printf ("data[%d] = %d\n", i, data[i]);
|
|
|
+
|
|
|
+ printf ("\n");
|
|
|
+
|
|
|
+ clobber4 (data);
|
|
|
+
|
|
|
+ /* @r{Show that element 4 has been changed.} */
|
|
|
+ for (i = 0; i < 6; i++)
|
|
|
+ printf ("data[%d] = %d\n", i, data[i]);
|
|
|
+
|
|
|
+ printf ("\n");
|
|
|
+
|
|
|
+ return EXIT_SUCCESS;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+shows that @code{data[4]} has become zero after the call to
|
|
|
+@code{clobber4}.
|
|
|
+
|
|
|
+The array @code{data} has 6 elements, but passing it to a function
|
|
|
+whose argument type is written as @code{int [20]} is not an error,
|
|
|
+because that really stands for @code{int *}. The pointer that is the
|
|
|
+real argument carries no indication of the length of the array it
|
|
|
+points into. It is not required to point to the beginning of the
|
|
|
+array, either. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+clobber4 (data+1);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+passes an ``array'' that starts at element 1 of @code{data}, and the
|
|
|
+effect is to zero @code{data[5]} instead of @code{data[4]}.
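The effect of the offset can be checked with a small sketch like the
following.  The helper @code{first_zero} is hypothetical, added only
to locate the element that was cleared.

```c
void
clobber4 (int *array)
{
  array[4] = 0;
}

/* Hypothetical helper: return the index of the first zero among
   the first N elements of DATA, or -1 if there is none.  */
int
first_zero (const int *data, int n)
{
  for (int i = 0; i < n; i++)
    if (data[i] == 0)
      return i;
  return -1;
}
```

If @code{data} holds the values 1 through 6, then after
@code{clobber4 (data + 1)} the call @code{first_zero (data, 6)}
returns 5, since element 4 counted from @code{data + 1} is element 5
of @code{data}.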
|
|
|
+
|
|
|
+If all calls to the function will provide an array of a particular
|
|
|
+size, you can specify the size of the array to be @code{static}:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (int array[static 20])
|
|
|
+@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This is a promise to the compiler that the function will always be
|
|
|
+called with an array of 20 elements, so that the compiler can optimize
|
|
|
+code accordingly. If the code breaks this promise and calls the
|
|
|
+function with, for example, a shorter array, unpredictable things may
|
|
|
+happen.
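Here is a minimal sketch of a function that relies on such a promise.
The name @code{sum20} is hypothetical.  Note that in standard C the
promise is that the array has at least that many elements.

```c
/* Hypothetical example: sum the first 20 elements of an array
   that is promised to have at least 20 elements.  */
int
sum20 (int array[static 20])
{
  int total = 0;

  for (int i = 0; i < 20; i++)
    total += array[i];
  return total;
}
```

Calling @code{sum20} with an array of fewer than 20 elements would
break the promise.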
|
|
|
+
|
|
|
+@node Array Parm Qualifiers
|
|
|
+@subsubsection Type qualifiers on array parameters
|
|
|
+
|
|
|
+You can use the type qualifiers @code{const}, @code{restrict}, and
|
|
|
+@code{volatile} with array parameters; for example:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (volatile int array[20])
|
|
|
+@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+denotes that @code{array} is equivalent to a pointer to a volatile
|
|
|
+@code{int}. Alternatively:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (int array[const 20])
|
|
|
+@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+makes the array parameter equivalent to a constant pointer to an
|
|
|
+@code{int}. If we want the @code{clobber4} function to succeed, it
|
|
|
+would not make sense to write
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+clobber4 (const int array[20])
|
|
|
+@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+as this would tell the compiler that the parameter should point to an
|
|
|
+array of constant @code{int} values, and then we would not be able to
|
|
|
+store zeros in them.
|
|
|
+
|
|
|
+In a function with multiple array parameters, you can use @code{restrict}
|
|
|
+to tell the compiler that each array parameter passed in will be distinct:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+foo (int array1[restrict 10], int array2[restrict 10])
|
|
|
+@r{@dots{}}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Using @code{restrict} promises the compiler that callers will
|
|
|
+not pass in the same array for more than one @code{restrict} array
|
|
|
+parameter. Knowing this enables the compiler to perform better code
|
|
|
+optimization. This is the same effect as using @code{restrict}
|
|
|
+pointers (@pxref{restrict Pointers}), but makes it clear when reading
|
|
|
+the code that an array of a specific size is expected.
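A complete function along these lines might look like the sketch
below.  The name @code{add_into} and the size 4 are assumptions made
for illustration; the second parameter also uses @code{const}, since
the function only reads from it.

```c
/* Hypothetical example: add the elements of SRC to those of DST.
   The restrict qualifiers promise the two arrays are distinct.  */
void
add_into (int dst[restrict 4], const int src[restrict 4])
{
  for (int i = 0; i < 4; i++)
    dst[i] += src[i];
}
```

A call such as @code{add_into (a, a)} would break the promise, so
callers must pass two different arrays.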
|
|
|
+
|
|
|
+@node Structs as Parameters
|
|
|
+@subsection Functions That Accept Structure Arguments
|
|
|
+
|
|
|
+Structures in GNU C are first-class objects, so using them as function
|
|
|
+parameters and arguments works in the natural way. This function
|
|
|
+@code{swapfoo} takes a @code{struct foo} with two fields as argument,
|
|
|
+and returns a structure of the same type but with the fields
|
|
|
+exchanged.
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo @{ int a, b; @};
|
|
|
+
|
|
|
+struct foo x;
|
|
|
+
|
|
|
+struct foo
|
|
|
+swapfoo (struct foo inval)
|
|
|
+@{
|
|
|
+ struct foo outval;
|
|
|
+ outval.a = inval.b;
|
|
|
+ outval.b = inval.a;
|
|
|
+ return outval;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This simpler definition of @code{swapfoo} avoids using a local
|
|
|
+variable to hold the result about to be returned, by using a structure
|
|
|
+constructor (@pxref{Structure Constructors}), like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct foo
|
|
|
+swapfoo (struct foo inval)
|
|
|
+@{
|
|
|
+ return (struct foo) @{ inval.b, inval.a @};
|
|
|
+@}
|
|
|
+@end example
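Because the structure is passed by value, the caller's copy is not
changed by the call.  The following sketch checks this; the helper
@code{swapfoo_demo} is hypothetical.

```c
struct foo { int a, b; };

struct foo
swapfoo (struct foo inval)
{
  return (struct foo) { inval.b, inval.a };
}

/* Hypothetical helper: return 1 if swapfoo gives back the
   exchanged fields while leaving the caller's structure as is.  */
int
swapfoo_demo (void)
{
  struct foo x = { 1, 2 };
  struct foo y = swapfoo (x);

  return y.a == 2 && y.b == 1 && x.a == 1 && x.b == 2;
}
```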
|
|
|
+
|
|
|
+It is valid to define a structure type in a function's parameter list,
|
|
|
+as in
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+frob_bar (struct bar @{ int a, b; @} inval)
|
|
|
+@{
|
|
|
+ @var{body}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+and @var{body} can access the fields of @code{inval} since the
|
|
|
+structure type @code{struct bar} is defined for the whole function
|
|
|
+body. However, there is no way to create a @code{struct bar} argument
|
|
|
+to pass to @code{frob_bar}, except with kludges. As a result,
|
|
|
+defining a structure type in a parameter list is useless in practice.
|
|
|
+
|
|
|
+@node Function Declarations
|
|
|
+@section Function Declarations
|
|
|
+@cindex function declarations
|
|
|
+@cindex declaring functions
|
|
|
+
|
|
|
+To call a function, or use its name as a pointer, a @dfn{function
|
|
|
+declaration} for the function name must be in effect at that point in
|
|
|
+the code. The function's definition serves as a declaration of that
|
|
|
+function for the rest of the containing scope, but to use the function
|
|
|
+in code before the definition, or from another compilation module, a
|
|
|
+separate function declaration must precede the use.
|
|
|
+
|
|
|
+A function declaration looks like the start of a function definition.
|
|
|
+It begins with the return value type (@code{void} if none) and the
|
|
|
+function name, followed by argument declarations in parentheses
|
|
|
+(though these can sometimes be omitted). But that's as far as the
|
|
|
+similarity goes: instead of the function body, the declaration uses a
|
|
|
+semicolon.
|
|
|
+
|
|
|
+@cindex function prototype
|
|
|
+@cindex prototype of a function
|
|
|
+A declaration that specifies argument types is called a @dfn{function
|
|
|
+prototype}. You can include the argument names or omit them. The
|
|
|
+names, if included in the declaration, have no effect, but they may
|
|
|
+serve as documentation.
|
|
|
+
|
|
|
+This form of prototype specifies fixed argument types:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} @var{function} (@var{argtypes}@r{@dots{}});
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This form says the function takes no arguments:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} @var{function} (void);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This form declares types for some arguments, and allows additional
|
|
|
+arguments whose types are not specified:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} @var{function} (@var{argtypes}@r{@dots{}}, ...);
|
|
|
+@end example
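As a sketch of the last form, here is a function that declares the
type of its first argument and accepts any number of additional
@code{int} arguments.  The name @code{sum_ints} is hypothetical, and
the example assumes the standard header @file{stdarg.h} as the way to
fetch the additional arguments.

```c
#include <stdarg.h>

/* Hypothetical example: COUNT is the one argument with a
   declared type; it tells the function how many additional
   int arguments follow.  */
int
sum_ints (int count, ...)
{
  va_list ap;
  int total = 0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);
  return total;
}
```

For instance, @code{sum_ints (3, 10, 20, 30)} returns 60.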
|
|
|
+
|
|
|
+For a parameter that's an array of variable length, you can write
|
|
|
+its declaration with @samp{*} where the ``length'' of the array would
|
|
|
+normally go; for example, these are all equivalent.
|
|
|
+
|
|
|
+@example
|
|
|
+double maximum (int n, int m, double a[n][m]);
|
|
|
+double maximum (int n, int m, double a[*][*]);
|
|
|
+double maximum (int n, int m, double a[ ][*]);
|
|
|
+double maximum (int n, int m, double a[ ][m]);
|
|
|
+@end example
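Any of these declarations is compatible with a definition such as the
one sketched here.  The body is an assumption, since only the
declaration is given above; it expects @code{n} and @code{m} to be at
least 1.

```c
/* A possible definition of maximum: return the largest element
   of the N by M array A.  */
double
maximum (int n, int m, double a[n][m])
{
  double max = a[0][0];

  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      if (a[i][j] > max)
        max = a[i][j];
  return max;
}
```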
|
|
|
+
|
|
|
+@noindent
|
|
|
+The old-fashioned form of declaration, which is not a prototype, says
|
|
|
+nothing about the types of arguments or how many they should be:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} @var{function} ();
|
|
|
+@end example
|
|
|
+
|
|
|
+@strong{Warning:} Arguments passed to a function declared without a
|
|
|
+prototype are converted with the default argument promotions
|
|
|
+(@pxref{Argument Promotions}). Likewise for additional arguments whose
|
|
|
+types are unspecified.
|
|
|
+
|
|
|
+Function declarations are usually written at the top level in a source file,
|
|
|
+but you can also put them inside code blocks. Then the function name
|
|
|
+is visible for the rest of the containing scope. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+void
|
|
|
+foo (char *file_name)
|
|
|
+@{
|
|
|
+ void save_file (char *);
|
|
|
+ save_file (file_name);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+If another part of the code tries to call the function
|
|
|
+@code{save_file}, this declaration won't be in effect there. So the
|
|
|
+function will get an implicit declaration of the form @code{extern int
|
|
|
+save_file ();}. That conflicts with the explicit declaration
|
|
|
+here, and the discrepancy generates a warning.
|
|
|
+
|
|
|
+The syntax of C traditionally allows omitting the data type in a
|
|
|
+function declaration if it specifies a storage class or a qualifier.
|
|
|
+Then the type defaults to @code{int}. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+static foo (double x);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+defaults the return type to @code{int}.
|
|
|
+This is bad practice; if you see it, fix it.
|
|
|
+
|
|
|
+Calling a function that is undeclared has the effect of creating an
|
|
|
+@dfn{implicit} declaration in the innermost containing scope,
|
|
|
+equivalent to this:
|
|
|
+
|
|
|
+@example
|
|
|
+extern int @var{function} ();
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This declaration says that the function returns @code{int} but leaves
|
|
|
+its argument types unspecified. If that does not accurately fit the
|
|
|
+function, then the program @strong{needs} an explicit declaration of
|
|
|
+the function with argument types in order to call it correctly.
|
|
|
+
|
|
|
+Implicit declarations are deprecated, and a function call that creates one
|
|
|
+causes a warning.
|
|
|
+
|
|
|
+@node Function Calls
|
|
|
+@section Function Calls
|
|
|
+@cindex function calls
|
|
|
+@cindex calling functions
|
|
|
+
|
|
|
+Starting a program automatically calls the function named @code{main}
|
|
|
+(@pxref{The main Function}). Aside from that, a function does nothing
|
|
|
+except when it is @dfn{called}. That occurs during the execution of a
|
|
|
+function-call expression specifying that function.
|
|
|
+
|
|
|
+A function-call expression looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{function} (@var{arguments}@r{@dots{}})
|
|
|
+@end example
|
|
|
+
|
|
|
+Most of the time, @var{function} is a function name. However, it can
|
|
|
+also be an expression with a function pointer value; that way, the
|
|
|
+program can determine at run time which function to call.
|
|
|
+
|
|
|
+The @var{arguments} are a series of expressions separated by commas.
|
|
|
+Each expression specifies one argument to pass to the function.
|
|
|
+
|
|
|
+The list of arguments in a function call looks just like use of the
|
|
|
+comma operator (@pxref{Comma Operator}), but the fact that it fills
|
|
|
+the parentheses of a function call gives it a different meaning.
|
|
|
+
|
|
|
+Here's an example of a function call, taken from an example near the
|
|
|
+beginning (@pxref{Complete Program}).
|
|
|
+
|
|
|
+@example
|
|
|
+printf ("Fibonacci series item %d is %d\n",
|
|
|
+ 19, fib (19));
|
|
|
+@end example
|
|
|
+
|
|
|
+The three arguments given to @code{printf} are a constant string, the
|
|
|
+integer 19, and the integer returned by @code{fib (19)}.
|
|
|
+
|
|
|
+@node Function Call Semantics
|
|
|
+@section Function Call Semantics
|
|
|
+@cindex function call semantics
|
|
|
+@cindex semantics of function calls
|
|
|
+@cindex call-by-value
|
|
|
+
|
|
|
+The meaning of a function call is to compute the specified argument
|
|
|
+expressions, convert their values according to the function's
|
|
|
+declaration, then run the function giving it copies of the converted
|
|
|
+values. (This method of argument passing is known as
|
|
|
+@dfn{call-by-value}.) When the function finishes, the value it
|
|
|
+returns becomes the value of the function-call expression.
|
|
|
+
|
|
|
+Call-by-value implies that an assignment to the function argument
|
|
|
+variable has no direct effect on the caller. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}.} */
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+void
|
|
|
+subroutine (int x)
|
|
|
+@{
|
|
|
+ x = 5;
|
|
|
+@}
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ int y = 20;
|
|
|
+ subroutine (y);
|
|
|
+ printf ("y is %d\n", y);
|
|
|
+ return EXIT_SUCCESS;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+prints @samp{y is 20}. Calling @code{subroutine} initializes @code{x}
|
|
|
+from the value of @code{y}, but this does not establish any other
|
|
|
+relationship between the two variables. Thus, the assignment to
|
|
|
+@code{x}, inside @code{subroutine}, changes only @emph{that} @code{x}.
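When a function does need to change a variable of its caller, the
usual technique is to pass the variable's address.  The pointer is
still passed by value, but the function can store through it.  Here
is a sketch contrasting the two; @code{subroutine_ptr} is a
hypothetical variant of @code{subroutine}.

```c
void
subroutine (int x)
{
  x = 5;           /* Changes only the local copy.  */
}

/* Hypothetical variant that receives the address of the
   caller's variable.  */
void
subroutine_ptr (int *xp)
{
  *xp = 5;         /* Changes the variable that xp points to.  */
}
```

After @code{int y = 20;}, calling @code{subroutine (y)} leaves
@code{y} equal to 20, whereas @code{subroutine_ptr (&y)} sets it to 5.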
|
|
|
+
|
|
|
+If an argument's type is specified by the function's declaration, the
|
|
|
+function call converts the argument expression to that type if
|
|
|
+possible. If the conversion is impossible, that is an error.
|
|
|
+
|
|
|
+If the function's declaration doesn't specify the type of that
|
|
|
+argument, then the @emph{default argument promotions} apply.
|
|
|
+@xref{Argument Promotions}.
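As a small sketch of the first case, consider a function whose
prototype specifies a @code{double} parameter.  The name @code{half}
is hypothetical.

```c
/* Hypothetical example function with a prototype.  */
double
half (double x)
{
  return x / 2;
}
```

The call @code{half (7)} converts the @code{int} value 7 to
@code{double} before the function runs, so the division is done in
floating point and the result is 3.5.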
|
|
|
+
|
|
|
+@node Function Pointers
|
|
|
+@section Function Pointers
|
|
|
+@cindex function pointers
|
|
|
+@cindex pointers to functions
|
|
|
+
|
|
|
+A function name refers to a fixed function. Sometimes it is useful to
|
|
|
+call a function to be determined at run time; to do this, you can use
|
|
|
+a @dfn{function pointer value} that points to the chosen function
|
|
|
+(@pxref{Pointers}).
|
|
|
+
|
|
|
+Pointer-to-function types can be used to declare variables and other
|
|
|
+data, including array elements, structure fields, and union
|
|
|
+alternatives. They can also be used for function arguments and return
|
|
|
+values. These types have the peculiarity that they are never
|
|
|
+converted automatically to @code{void *} or vice versa. However, you
|
|
|
+can do that conversion with a cast.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Declaring Function Pointers:: How to declare a pointer to a function.
|
|
|
+* Assigning Function Pointers:: How to assign values to function pointers.
|
|
|
+* Calling Function Pointers:: How to call functions through pointers.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Declaring Function Pointers
|
|
|
+@subsection Declaring Function Pointers
|
|
|
+@cindex declaring function pointers
|
|
|
+@cindex function pointers, declaring
|
|
|
+
|
|
|
+The declaration of a function pointer variable (or structure field)
|
|
|
+looks almost like a function declaration, except it has an additional
|
|
|
+@samp{*} just before the variable name. Proper nesting requires a
|
|
|
+pair of parentheses around the two of them. For instance, @code{int
|
|
|
+(*a) ();} says, ``Declare @code{a} as a pointer such that @code{*a} is
|
|
|
+an @code{int}-returning function.''
|
|
|
+
|
|
|
+Contrast these three declarations:
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{Declare a function returning @code{char *}.} */
|
|
|
+char *a (char *);
|
|
|
+/* @r{Declare a pointer to a function returning @code{char}.} */
|
|
|
+char (*a) (char *);
|
|
|
+/* @r{Declare a pointer to a function returning @code{char *}.} */
|
|
|
+char *(*a) (char *);
|
|
|
+@end example
|
|
|
+
|
|
|
+The possible argument types of the function pointed to are the same
|
|
|
+as in a function declaration. You can write a prototype
|
|
|
+that specifies all the argument types:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}});
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or one that specifies some and leaves the rest unspecified:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}}, ...);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or one that says there are no arguments:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} (*@var{function}) (void);
|
|
|
+@end example
|
|
|
+
|
|
|
+You can also write a non-prototype declaration that says
|
|
|
+nothing about the argument types:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype} (*@var{function}) ();
|
|
|
+@end example
|
|
|
+
|
|
|
+For example, here's a declaration for a variable that should
|
|
|
+point to some arithmetic function that operates on two @code{double}s:
|
|
|
+
|
|
|
+@example
|
|
|
+double (*binary_op) (double, double);
|
|
|
+@end example
|
|
|
+
|
|
|
+Structure fields, union alternatives, and array elements can be
|
|
|
+function pointers; so can parameter variables. The function pointer
|
|
|
+declaration construct can also be combined with other operators
|
|
|
+allowed in declarations. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+int **(*foo)();
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as a pointer to a function that returns
|
|
|
+type @code{int **}, and
|
|
|
+
|
|
|
+@example
|
|
|
+int **(*foo[30])();
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as an array of 30 pointers to functions that
|
|
|
+return type @code{int **}.
|
|
|
+
|
|
|
+@example
|
|
|
+int **(**foo)();
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+declares @code{foo} as a pointer to a pointer to a function that
|
|
|
+returns type @code{int **}.
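Arrays of function pointers are useful for choosing an operation by
number.  The following is a sketch only; the names @code{double_mul},
@code{binary_ops} and @code{apply_op} are hypothetical, and the index
is assumed to be 0 or 1.

```c
double
double_add (double a, double b)
{
  return a + b;
}

double
double_mul (double a, double b)
{
  return a * b;
}

/* An array of 2 pointers to functions that take two doubles
   and return double.  */
double (*binary_ops[2]) (double, double) = { double_add, double_mul };

/* Hypothetical helper: call the function selected by WHICH,
   which must be 0 or 1.  */
double
apply_op (int which, double a, double b)
{
  return binary_ops[which] (a, b);
}
```

Here @code{apply_op (0, 2.0, 3.0)} adds the two numbers, and
@code{apply_op (1, 2.0, 3.0)} multiplies them.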
|
|
|
+
|
|
|
+@node Assigning Function Pointers
|
|
|
+@subsection Assigning Function Pointers
|
|
|
+@cindex assigning function pointers
|
|
|
+@cindex function pointers, assigning
|
|
|
+
|
|
|
+Assuming we have declared the variable @code{binary_op} as in the
|
|
|
+previous section, giving it a value requires a suitable function to
|
|
|
+use. So let's define a function suitable for the variable to point
|
|
|
+to. Here's one:
|
|
|
+
|
|
|
+@example
|
|
|
+double
|
|
|
+double_add (double a, double b)
|
|
|
+@{
|
|
|
+ return a+b;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Now we can give it a value:
|
|
|
+
|
|
|
+@example
|
|
|
+binary_op = double_add;
|
|
|
+@end example
|
|
|
+
|
|
|
+The target type of the function pointer must be upward compatible with
|
|
|
+the type of the function (@pxref{Compatible Types}).
|
|
|
+
|
|
|
+There is no need for @samp{&} in front of @code{double_add}.
|
|
|
+Using a function name such as @code{double_add} as an expression
|
|
|
+automatically converts it to the function's address, with the
|
|
|
+appropriate function pointer type. However, it is ok to use
|
|
|
+@samp{&} if you feel that is clearer:
|
|
|
+
|
|
|
+@example
|
|
|
+binary_op = &double_add;
|
|
|
+@end example
|
|
|
+
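To make the equivalence of the two spellings concrete, here is a small sketch that stores the address both ways and compares the results. The helper @code{same_address} is a hypothetical name used only for this illustration.

```c
double
double_add (double a, double b)
{
  return a + b;
}

double (*binary_op) (double, double);

/* Return 1 if the plain name and the '&' form give the same
   pointer value, 0 otherwise.  (Hypothetical helper.)  */
int
same_address (void)
{
  double (*plain) (double, double) = double_add;
  double (*with_amp) (double, double) = &double_add;

  binary_op = plain;
  return plain == with_amp && binary_op == with_amp;
}
```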
|
|
|
+@node Calling Function Pointers
|
|
|
+@subsection Calling Function Pointers
|
|
|
+@cindex calling function pointers
|
|
|
+@cindex function pointers, calling
|
|
|
+
|
|
|
+To call the function specified by a function pointer, just write the
|
|
|
+function pointer value in a function call. For instance, here's a
|
|
|
+call to the function @code{binary_op} points to:
|
|
|
+
|
|
|
+@example
|
|
|
+binary_op (x, 5)
|
|
|
+@end example
|
|
|
+
|
|
|
+Since the data type of @code{binary_op} explicitly specifies type
|
|
|
+@code{double} for the arguments, the call converts @code{x} and 5 to
|
|
|
+@code{double}.
|
|
|
+
|
|
|
+The call conceptually dereferences the pointer @code{binary_op} to
|
|
|
+``get'' the function it points to, and calls that function. If you
|
|
|
+wish, you can explicitly represent the dereference by writing the
|
|
|
+@code{*} operator:
|
|
|
+
|
|
|
+@example
|
|
|
+(*binary_op) (x, 5)
|
|
|
+@end example
|
|
|
+
|
|
|
+The @samp{*} reminds people reading the code that @code{binary_op} is
|
|
|
+a function pointer rather than the name of a specific function.
|
|
|
+
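The following sketch calls through the same pointer in both styles. It assumes a hypothetical target function @code{double_mul} and a hypothetical wrapper @code{call_both_ways}; neither name comes from the text above.

```c
/* Hypothetical function for the pointer to point to.  */
double
double_mul (double a, double b)
{
  return a * b;
}

/* Call the function that BINARY_OP points to, once in each style,
   and return the sum of the two results.  */
double
call_both_ways (double (*binary_op) (double, double), int x)
{
  /* X and 5 are converted to double, because the pointer's type
     includes a prototype.  */
  double implicit = binary_op (x, 5);
  double explicit_deref = (*binary_op) (x, 5);

  return implicit + explicit_deref;
}
```

Both calls reach the same function with the same converted arguments, so the two results are always equal.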
|
|
|
+@node The main Function
|
|
|
+@section The @code{main} Function
|
|
|
+@cindex @code{main} function
|
|
|
+@findex main
|
|
|
+
|
|
|
+Every complete executable program requires at least one function,
|
|
|
+called @code{main}, which is where execution begins. You do not have
|
|
|
+to explicitly declare @code{main}, though GNU C permits you to do so.
|
|
|
+Conventionally, @code{main} should be defined to follow one of these
|
|
|
+calling conventions:
|
|
|
+
|
|
|
+@example
|
|
|
+int main (void) @{@r{@dots{}}@}
|
|
|
+int main (int argc, char *argv[]) @{@r{@dots{}}@}
|
|
|
+int main (int argc, char *argv[], char *envp[]) @{@r{@dots{}}@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+Using @code{void} as the parameter list means that @code{main} does
|
|
|
+not use the arguments. You can write @code{char **argv} instead of
|
|
|
+@code{char *argv[]}, and likewise for @code{envp}, as the two
|
|
|
+constructs are equivalent.
|
|
|
+
|
|
|
+@ignore @c Not so at present
|
|
|
+Defining @code{main} in any other way generates a warning. Your
|
|
|
+program will still compile, but you may get unexpected results when
|
|
|
+executing it.
|
|
|
+@end ignore
|
|
|
+
|
|
|
+You can call @code{main} from C code, as you can call any other
|
|
|
+function, though that is an unusual thing to do. When you do that,
|
|
|
+you must write the call to pass arguments that match the parameters in
|
|
|
+the definition of @code{main}.
|
|
|
+
|
|
|
+The @code{main} function is not actually the first code that runs when
|
|
|
+a program starts. In fact, the first code that runs is system code
|
|
|
+from the file @file{crt0.o}. In Unix, this was hand-written assembler
|
|
|
+code, but in GNU we replaced it with C code. Its job is to find
|
|
|
+the arguments for @code{main} and call that.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Values from main:: Returning values from the main function.
|
|
|
+* Command-line Parameters:: Accessing command-line parameters
|
|
|
+ provided to the program.
|
|
|
+* Environment Variables:: Accessing system environment variables.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Values from main
|
|
|
+@subsection Returning Values from @code{main}
|
|
|
+@cindex returning values from @code{main}
|
|
|
+@cindex success
|
|
|
+@cindex failure
|
|
|
+@cindex exit status
|
|
|
+
|
|
|
+When @code{main} returns, the process terminates. Whatever value
|
|
|
+@code{main} returns becomes the exit status which is reported to the
|
|
|
+parent process. While nominally the return value is of type
|
|
|
+@code{int}, in fact the exit status gets truncated to eight bits; if
|
|
|
+@code{main} returns the value 256, the exit status is 0.
|
|
|
+
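The truncation can be modeled with a one-line function. This is only a sketch of the behavior on typical POSIX systems, not code that the system actually runs, and the name @code{visible_exit_status} is hypothetical.

```c
/* Model of the truncation: only the low eight bits of the value
   returned by main reach the parent process.  (Assumes the usual
   POSIX convention; the function name is hypothetical.)  */
int
visible_exit_status (int main_return_value)
{
  return main_return_value & 0xFF;
}
```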
|
|
|
+Normally, programs return only one of two values: 0 for success,
|
|
|
+and 1 for failure. For maximum portability, use the macro
|
|
|
+values @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} defined in
|
|
|
+@code{stdlib.h}. Here's an example:
|
|
|
+
|
|
|
+@cindex @code{EXIT_FAILURE}
|
|
|
+@cindex @code{EXIT_SUCCESS}
|
|
|
+@example
|
|
|
+#include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}} */
|
|
|
+ /* @r{and @code{EXIT_FAILURE}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+ if (foo)
|
|
|
+ return EXIT_SUCCESS;
|
|
|
+ else
|
|
|
+ return EXIT_FAILURE;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Some types of programs maintain special conventions for various return
|
|
|
+values; for example, comparison programs including @code{cmp} and
|
|
|
+@code{diff} return 1 to indicate a mismatch, and 2 to indicate that
|
|
|
+the comparison couldn't be performed.
|
|
|
+
|
|
|
+@node Command-line Parameters
|
|
|
+@subsection Accessing Command-line Parameters
|
|
|
+@cindex command-line parameters
|
|
|
+@cindex parameters, command-line
|
|
|
+
|
|
|
+If the program was invoked with any command-line arguments, it can
|
|
|
+access them through the arguments of @code{main}, @code{argc} and
|
|
|
+@code{argv}. (You can give these arguments any names, but the names
|
|
|
+@code{argc} and @code{argv} are customary.)
|
|
|
+
|
|
|
+The value of @code{argv} is an array containing all of the
|
|
|
+command-line arguments as strings, with the name of the command
|
|
|
+invoked as the first string. @code{argc} is an integer that says how
|
|
|
+many strings @code{argv} contains. Here is an example of accessing
|
|
|
+the command-line parameters, retrieving the program's name and
|
|
|
+checking for the standard @option{--version} and @option{--help} options:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <string.h> /* @r{Declare @code{strcmp}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (int argc, char *argv[])
|
|
|
+@{
|
|
|
+ char *program_name = argv[0];
|
|
|
+
|
|
|
+ for (int i = 1; i < argc; i++)
|
|
|
+ @{
|
|
|
+ if (!strcmp (argv[i], "--version"))
|
|
|
+ @{
|
|
|
+ /* @r{Print version information and exit.} */
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+ else if (!strcmp (argv[i], "--help"))
|
|
|
+ @{
|
|
|
+ /* @r{Print help information and exit.} */
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+ @}
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
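The scanning loop above can also be packaged as a helper that @code{main} calls once per option. The following is a minimal sketch; the name @code{find_option} and its return convention are assumptions made for this illustration.

```c
#include <string.h>   /* Declares strcmp.  */

/* Return the index of the first argument equal to OPTION, or 0 if
   none matches.  Index 0 holds the program name, so 0 can never be
   the index of a real match.  (Hypothetical helper.)  */
int
find_option (int argc, char *argv[], const char *option)
{
  for (int i = 1; i < argc; i++)
    if (strcmp (argv[i], option) == 0)
      return i;
  return 0;
}
```

With this helper, @code{main} could test @code{find_option (argc, argv, "--help")} and print the help text when the result is nonzero.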
|
|
|
+@node Environment Variables
|
|
|
+@subsection Accessing Environment Variables
|
|
|
+@cindex environment variables
|
|
|
+
|
|
|
+You can optionally include a third parameter to @code{main}, another
|
|
|
+array of strings, to capture the environment variables available to
|
|
|
+the program. Unlike what happens with @code{argv}, there is no
|
|
|
+additional parameter for the count of environment variables; rather,
|
|
|
+the array of environment variables concludes with a null pointer.
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (int argc, char *argv[], char *envp[])
|
|
|
+@{
|
|
|
+ /* @r{Print out all environment variables.} */
|
|
|
+ int i = 0;
|
|
|
+ while (envp[i])
|
|
|
+ @{
|
|
|
+ printf ("%s\n", envp[i]);
|
|
|
+ i++;
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Another method of retrieving environment variables is to use the
|
|
|
+library function @code{getenv}, which is defined in @code{stdlib.h}.
|
|
|
+Using @code{getenv} does not require defining @code{main} to accept the
|
|
|
+@code{envp} pointer. For example, here is a program that fetches and prints
|
|
|
+the user's home directory (if defined):
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdlib.h> /* @r{Declares @code{getenv}.} */
|
|
|
+#include <stdio.h> /* @r{Declares @code{printf}.} */
|
|
|
+
|
|
|
+int
|
|
|
+main (void)
|
|
|
+@{
|
|
|
+ char *home_directory = getenv ("HOME");
|
|
|
+ if (home_directory)
|
|
|
+ printf ("My home directory is: %s\n", home_directory);
|
|
|
+ else
|
|
|
+ printf ("My home directory is not defined!\n");
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
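For comparison, here is a sketch of how a lookup over an @code{envp}-style array might work. It assumes that each entry has the conventional form @samp{NAME=VALUE}, which the text above does not state, and the name @code{lookup_env} is hypothetical. In real programs, @code{getenv} is the simpler choice.

```c
#include <stddef.h>   /* Defines NULL and size_t.  */
#include <string.h>   /* Declares strlen and strncmp.  */

/* Search an envp-style array, which ends with a null pointer, for
   NAME.  Return a pointer to the text after "NAME=", or NULL if
   NAME is not present.  (Hypothetical helper; assumes each entry
   looks like "NAME=VALUE".)  */
const char *
lookup_env (char *envp[], const char *name)
{
  size_t len = strlen (name);

  for (int i = 0; envp[i] != NULL; i++)
    if (strncmp (envp[i], name, len) == 0 && envp[i][len] == '=')
      return envp[i] + len + 1;
  return NULL;
}
```

The check for @samp{=} right after the name keeps a search for @samp{HOME} from matching an entry such as @samp{HOMEPAGE=...}.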
|
|
|
+@node Advanced Definitions
|
|
|
+@section Advanced Function Features
|
|
|
+
|
|
|
+This section describes some advanced or obscure features for GNU C
|
|
|
+function definitions. If you are just learning C, you can skip the
|
|
|
+rest of this chapter.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Variable-Length Array Parameters:: Functions that accept arrays
|
|
|
+ of variable length.
|
|
|
+* Variable Number of Arguments:: Variadic functions.
|
|
|
+* Nested Functions:: Defining functions within functions.
|
|
|
+* Inline Function Definitions:: A function call optimization technique.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Variable-Length Array Parameters
|
|
|
+@subsection Variable-Length Array Parameters
|
|
|
+@cindex variable-length array parameters
|
|
|
+@cindex array parameters, variable-length
|
|
|
+@cindex functions that accept variable-length arrays
|
|
|
+
|
|
|
+An array parameter can have variable length: simply declare the array
|
|
|
+type with a size that isn't constant. In a nested function, the
|
|
|
+length can refer to a variable defined in a containing scope. In any
|
|
|
+function, it can refer to a previous parameter, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len, char data[len][len])
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
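As a complete illustration of a length taken from a previous parameter, the following sketch sums every element of a square array. The function name @code{sum_square} is hypothetical.

```c
/* Sum every element of a LEN-by-LEN array.  The bounds of DATA
   come from the earlier parameter LEN.  (Hypothetical example.)  */
int
sum_square (int len, char data[len][len])
{
  int total = 0;

  for (int i = 0; i < len; i++)
    for (int j = 0; j < len; j++)
      total += data[i][j];
  return total;
}
```

The caller passes the dimension first and then an array whose dimensions agree with it, as in @code{sum_square (2, grid)} for a @code{char grid[2][2]}.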
|
|
|
+Alternatively, in function declarations (but not in function
|
|
|
+definitions), you can use @code{[*]} to denote that the array
|
|
|
+parameter is of a variable length, such that these two declarations
|
|
|
+mean the same thing:
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len, char data[len][len]);
|
|
|
+@end example
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len, char data[*][*]);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The two forms of input are equivalent in GNU C, but emphasizing that
|
|
|
+the array parameter is variable-length may be helpful to those
|
|
|
+studying the code.
|
|
|
+
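Here is a short sketch that pairs a @code{[*]} declaration with its definition. The function @code{trace_square}, which adds up the diagonal of a square array, is a hypothetical example.

```c
/* Declaration: [*] marks each dimension as variable-length.
   This form is allowed only in declarations.  */
int trace_square (int len, char data[*][*]);

/* Definition: here the actual length expressions are required.
   (Hypothetical example function.)  */
int
trace_square (int len, char data[len][len])
{
  int total = 0;

  for (int i = 0; i < len; i++)
    total += data[i][i];
  return total;
}
```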
|
|
|
+You can also omit the length parameter, and instead use some other
|
|
|
+in-scope variable for the length in the function definition:
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (char data[*][*]);
|
|
|
+@r{@dots{}}
|
|
|
+int dataLength = 20;
|
|
|
+@r{@dots{}}
|
|
|
+struct entry
|
|
|
+tester (char data[dataLength][dataLength])
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@c ??? check text above
|
|
|
+
|
|
|
+@cindex parameter forward declaration
|
|
|
+In GNU C, to pass the array first and the length afterward, you can
|
|
|
+use a @dfn{parameter forward declaration}, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+struct entry
|
|
|
+tester (int len; char data[len][len], int len)
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+The @samp{int len} before the semicolon is the parameter forward
|
|
|
+declaration; it serves the purpose of making the name @code{len} known
|
|
|
+when the declaration of @code{data} is parsed.
|
|
|
+
|
|
|
+You can write any number of such parameter forward declarations in the
|
|
|
+parameter list. They can be separated by commas or semicolons, but
|
|
|
+the last one must end with a semicolon, which is followed by the
|
|
|
+``real'' parameter declarations. Each forward declaration must match
|
|
|
+a subsequent ``real'' declaration in parameter name and data type.
|
|
|
+
|
|
|
+Standard C does not support parameter forward declarations.
|
|
|
+
|
|
|
+@node Variable Number of Arguments
|
|
|
+@subsection Variable-Length Parameter Lists
|
|
|
+@cindex variable-length parameter lists
|
|
|
+@cindex parameter lists, variable length
|
|
|
+@cindex function parameter lists, variable length
|
|
|
+
|
|
|
+@cindex variadic function
|
|
|
+A function that takes a variable number of arguments is called a
|
|
|
+@dfn{variadic function}. In C, a variadic function must specify at
|
|
|
+least one fixed argument with an explicitly declared data type.
|
|
|
+Additional arguments can follow, and can vary in both quantity and
|
|
|
+data type.
|
|
|
+
|
|
|
+In the function header, declare the fixed parameters in the normal
|
|
|
+way, then write a comma and an ellipsis: @samp{, ...}. Here is an
|
|
|
+example of a variadic function header:
|
|
|
+
|
|
|
+@example
|
|
|
+int add_multiple_values (int number, ...)
|
|
|
+@end example
|
|
|
+
|
|
|
+@cindex @code{va_list}
|
|
|
+@cindex @code{va_start}
|
|
|
+@cindex @code{va_end}
|
|
|
+The function body can refer to fixed arguments by their parameter
|
|
|
+names, but the additional arguments have no names. Accessing them in
|
|
|
+the function body uses certain standard macros. They are defined in
|
|
|
+the library header file @file{stdarg.h}, so the code must
|
|
|
+@code{#include} that file.
|
|
|
+
|
|
|
+In the body, write
|
|
|
+
|
|
|
+@example
|
|
|
+va_list ap;
|
|
|
+va_start (ap, @var{last_fixed_parameter});
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This declares the variable @code{ap} (you can use any name for it)
|
|
|
+and then sets it up to point before the first additional argument.
|
|
|
+
|
|
|
+Then, to fetch the next consecutive additional argument, write this:
|
|
|
+
|
|
|
+@example
|
|
|
+va_arg (ap, @var{type})
|
|
|
+@end example
|
|
|
+
|
|
|
+After fetching all the additional arguments (or as many as need to be
|
|
|
+used), write this:
|
|
|
+
|
|
|
+@example
|
|
|
+va_end (ap);
|
|
|
+@end example
|
|
|
+
|
|
|
+Here's an example of a variadic function definition that adds any
|
|
|
+number of @code{int} arguments. The first (fixed) argument says how
|
|
|
+many more arguments follow.
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdarg.h> /* @r{Defines @code{va}@r{@dots{}} macros.} */
|
|
|
+@r{@dots{}}
|
|
|
+
|
|
|
+int
|
|
|
+add_multiple_values (int argcount, ...)
|
|
|
+@{
|
|
|
+ int counter, total = 0;
|
|
|
+
|
|
|
+ /* @r{Declare a variable of type @code{va_list}.} */
|
|
|
+ va_list argptr;
|
|
|
+
|
|
|
+  /* @r{Initialize that variable.} */
|
|
|
+ va_start (argptr, argcount);
|
|
|
+
|
|
|
+ for (counter = 0; counter < argcount; counter++)
|
|
|
+ @{
|
|
|
+ /* @r{Get the next additional argument.} */
|
|
|
+ total += va_arg (argptr, int);
|
|
|
+ @}
|
|
|
+
|
|
|
+ /* @r{End use of the @code{argptr} variable.} */
|
|
|
+ va_end (argptr);
|
|
|
+
|
|
|
+ return total;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+With GNU C, @code{va_end} is superfluous, but some other compilers
|
|
|
+might make @code{va_start} allocate memory so that calling
|
|
|
+@code{va_end} is necessary to avoid a memory leak. Before doing
|
|
|
+@code{va_start} again with the same variable, do @code{va_end}
|
|
|
+first.
|
|
|
+
|
|
|
+@cindex @code{va_copy}
|
|
|
+Because of this possible memory allocation, it is risky (in principle)
|
|
|
+to copy one @code{va_list} variable to another with assignment.
|
|
|
+Instead, use @code{va_copy}, which copies the substance but allocates
|
|
|
+separate memory in the variable you copy to. The call looks like
|
|
|
+@code{va_copy (@var{to}, @var{from})}, where both @var{to} and
|
|
|
+@var{from} should be variables of type @code{va_list}. In principle,
|
|
|
+do @code{va_end} on each of these variables before its scope ends.
|
|
|
+
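A typical reason to use @code{va_copy} is to traverse the additional arguments twice. The following sketch counts how many of its @code{int} arguments exceed their mean, which needs one pass to compute the total and a second pass to compare. The function name @code{count_above_mean} is hypothetical.

```c
#include <stdarg.h>   /* Defines the va_... macros.  */

/* Count how many of the COUNT additional int arguments exceed
   their mean.  (Hypothetical example function.)  */
int
count_above_mean (int count, ...)
{
  va_list first_pass, second_pass;
  long total = 0;
  int above = 0;

  if (count <= 0)
    return 0;

  va_start (first_pass, count);
  /* Copy before fetching anything, so that the copy also starts
     at the first additional argument.  */
  va_copy (second_pass, first_pass);

  for (int i = 0; i < count; i++)
    total += va_arg (first_pass, int);
  va_end (first_pass);

  /* value > total / count, written without division.  */
  for (int i = 0; i < count; i++)
    if ((long) va_arg (second_pass, int) * count > total)
      above++;
  va_end (second_pass);

  return above;
}
```

Each of the two @code{va_list} variables gets its own @code{va_end}, as recommended above.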
|
|
|
+Since the additional arguments' types are not specified in the
|
|
|
+function's definition, the default argument promotions
|
|
|
+(@pxref{Argument Promotions}) apply to them in function calls. The
|
|
|
+function definition must take account of this; thus, if an argument
|
|
|
+was passed as @code{short}, the function should get it as @code{int}.
|
|
|
+If an argument was passed as @code{float}, the function should get it
|
|
|
+as @code{double}.
|
|
|
+
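The effect of the promotions can be seen in a sketch like the following, where the caller may pass @code{float} values but the function always fetches @code{double}. The name @code{sum_reals} is hypothetical.

```c
#include <stdarg.h>   /* Defines the va_... macros.  */

/* Add up COUNT additional floating-point arguments.
   (Hypothetical example function.)  */
double
sum_reals (int count, ...)
{
  va_list ap;
  double total = 0.0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    /* Fetch as double even if the caller passed a float: the
       default argument promotions have already converted it.  */
    total += va_arg (ap, double);
  va_end (ap);

  return total;
}
```

Writing @code{va_arg (ap, float)} here would be a mistake, since no additional argument ever arrives with type @code{float}.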
|
|
|
+C has no mechanism to tell the variadic function how many arguments
|
|
|
+were passed to it, so its calling convention must give it a way to
|
|
|
+determine this. That's why @code{add_multiple_values} takes a fixed
|
|
|
+argument that says how many more arguments follow. Thus, you can
|
|
|
+call the function like this:
|
|
|
+
|
|
|
+@example
|
|
|
+sum = add_multiple_values (3, 12, 34, 190);
|
|
|
+/* @r{Value is 12+34+190.} */
|
|
|
+@end example
|
|
|
+
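A count is not the only possible convention. Another common one marks the end of the arguments with a sentinel value. The sketch below, with the hypothetical name @code{total_length}, adds up the lengths of its string arguments and stops at a null pointer.

```c
#include <stdarg.h>   /* Defines the va_... macros.  */
#include <stddef.h>   /* Defines NULL and size_t.  */
#include <string.h>   /* Declares strlen.  */

/* Return the total length of all the string arguments.  The list
   must end with a null pointer of type char *.
   (Hypothetical example function.)  */
size_t
total_length (const char *first, ...)
{
  va_list ap;
  size_t total = 0;

  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    total += strlen (s);
  va_end (ap);

  return total;
}
```

The caller writes the sentinel with an explicit cast, as in @code{total_length ("ab", "cde", (char *) NULL)}, because the additional arguments have no declared types to convert a plain @code{NULL} to the right pointer type.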
|
|
|
+In GNU C, there is no actual need to use the @code{va_end} macro.
|
|
|
+In fact, it does nothing. It's used for compatibility with other
|
|
|
+compilers, when that matters.
|
|
|
+
|
|
|
+It is a mistake to access variables declared as @code{va_list} except
|
|
|
+in the specific ways described here. Just what that type consists of
|
|
|
+is an implementation detail, which could vary from one platform to
|
|
|
+another.
|
|
|
+
|
|
|
+@node Nested Functions
|
|
|
+@subsection Nested Functions
|
|
|
+@cindex nested functions
|
|
|
+@cindex functions, nested
|
|
|
+@cindex downward funargs
|
|
|
+@cindex thunks
|
|
|
+
|
|
|
+A @dfn{nested function} is a function defined inside another function.
|
|
|
+The nested function's name is local to the block where it is defined.
|
|
|
+For example, here we define a nested function named @code{square}, and
|
|
|
+call it twice:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+double
|
|
|
+foo (double a, double b)
|
|
|
+@{
|
|
|
+ double square (double z) @{ return z * z; @}
|
|
|
+
|
|
|
+ return square (a) + square (b);
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+The nested function can access all the variables of the containing
|
|
|
+function that are visible at the point of its definition. This is
|
|
|
+called @dfn{lexical scoping}. For example, here we show a nested
|
|
|
+function that uses an inherited variable named @code{offset}:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int
|
|
|
+bar (int *array, int offset, int size)
|
|
|
+@{
|
|
|
+ int access (int *array, int index)
|
|
|
+ @{ return array[index + offset]; @}
|
|
|
+ int i;
|
|
|
+ @r{@dots{}}
|
|
|
+ for (i = 0; i < size; i++)
|
|
|
+ @r{@dots{}} access (array, i) @r{@dots{}}
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+Nested function definitions can appear wherever automatic variable
|
|
|
+declarations are allowed; that is, in any block, interspersed with the
|
|
|
+other declarations and statements in the block.
|
|
|
+
|
|
|
+The nested function's name is visible only within the parent block;
|
|
|
+the name's scope starts from its definition and continues to the end
|
|
|
+of the containing block. If the nested function's name
|
|
|
+is the same as the parent function's name, there will be
|
|
|
+no way to refer to the parent function inside the scope of the
|
|
|
+name of the nested function.
|
|
|
+
|
|
|
+Using @code{extern} or @code{static} on a nested function definition
|
|
|
+is an error.
|
|
|
+
|
|
|
+It is possible to call the nested function from outside the scope of its
|
|
|
+name by storing its address or passing the address to another function.
|
|
|
+You can do this safely, but you must be careful:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+void
|
|
|
+hack (int *array, int size, int addition)
|
|
|
+@{
|
|
|
+ void store (int index, int value)
|
|
|
+ @{ array[index] = value + addition; @}
|
|
|
+
|
|
|
+ intermediate (store, size);
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+Here, the function @code{intermediate} receives the address of
|
|
|
+@code{store} as an argument. If @code{intermediate} calls @code{store},
|
|
|
+the arguments given to @code{store} are used to store into @code{array}.
|
|
|
+@code{store} also accesses @code{hack}'s local variable @code{addition}.
|
|
|
+
|
|
|
+It is safe for @code{intermediate} to call @code{store} because
|
|
|
+@code{hack}'s stack frame, with its arguments and local variables,
|
|
|
+continues to exist during the call to @code{intermediate}.
|
|
|
+
|
|
|
+Calling the nested function through its address after the containing
|
|
|
+function has exited is asking for trouble. If it is called after a
|
|
|
+containing scope level has exited, and if it refers to some of the
|
|
|
+variables that are no longer in scope, it will refer to memory
|
|
|
+containing junk or other data. It's not wise to take the risk.
|
|
|
+
|
|
|
+The GNU C Compiler implements taking the address of a nested function
|
|
|
+using a technique called @dfn{trampolines}. This technique was
|
|
|
+described in @cite{Lexical Closures for C@t{++}} (Thomas M. Breuel,
|
|
|
+USENIX C@t{++} Conference Proceedings, October 17--21, 1988).
|
|
|
+
|
|
|
+A nested function can jump to a label inherited from a containing
|
|
|
+function, provided the label was explicitly declared in the containing
|
|
|
+function (@pxref{Local Labels}). Such a jump returns instantly to the
|
|
|
+containing function, exiting the nested function that did the
|
|
|
+@code{goto} and any intermediate function invocations as well. Here
|
|
|
+is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int
|
|
|
+bar (int *array, int offset, int size)
|
|
|
+@{
|
|
|
+ /* @r{Explicitly declare the label @code{failure}.} */
|
|
|
+ __label__ failure;
|
|
|
+ int access (int *array, int index)
|
|
|
+ @{
|
|
|
+ if (index > size)
|
|
|
+ /* @r{Exit this function,}
|
|
|
+ @r{and return to @code{bar}.} */
|
|
|
+ goto failure;
|
|
|
+ return array[index + offset];
|
|
|
+ @}
|
|
|
+@end group
|
|
|
+
|
|
|
+@group
|
|
|
+ int i;
|
|
|
+ @r{@dots{}}
|
|
|
+ for (i = 0; i < size; i++)
|
|
|
+ @r{@dots{}} access (array, i) @r{@dots{}}
|
|
|
+ @r{@dots{}}
|
|
|
+ return 0;
|
|
|
+
|
|
|
+ /* @r{Control comes here from @code{access}
|
|
|
+ if it does the @code{goto}.} */
|
|
|
+ failure:
|
|
|
+ return -1;
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+To declare the nested function before its definition, use
|
|
|
+@code{auto} (which is otherwise meaningless for function declarations;
|
|
|
+@pxref{auto and register}). For example,
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+bar (int *array, int offset, int size)
|
|
|
+@{
|
|
|
+ auto int access (int *, int);
|
|
|
+ @r{@dots{}}
|
|
|
+ @r{@dots{}} access (array, i) @r{@dots{}}
|
|
|
+ @r{@dots{}}
|
|
|
+ int access (int *array, int index)
|
|
|
+ @{
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+ @r{@dots{}}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Inline Function Definitions
|
|
|
+@subsection Inline Function Definitions
|
|
|
+@cindex inline function definitions
|
|
|
+@cindex function definitions, inline
|
|
|
+@findex inline
|
|
|
+
|
|
|
+To declare a function inline, use the @code{inline} keyword in its
|
|
|
+definition. Here are two simple functions, each returning one field
|
|
|
+of a @code{struct list}, both declared inline:
|
|
|
+
|
|
|
+@example
|
|
|
+struct list
|
|
|
+@{
|
|
|
+ struct list *first, *second;
|
|
|
+@};
|
|
|
+
|
|
|
+inline struct list *
|
|
|
+pair_first (struct list *p)
|
|
|
+@{
|
|
|
+  return p->first;
|
|
|
+@}
|
|
|
+
|
|
|
+inline struct list *
|
|
|
+pair_second (struct list *p)
|
|
|
+@{
|
|
|
+ return p->second;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Optimized compilation can substitute the inline function's body for
|
|
|
+any call to it. This is called @emph{inlining} the function. It
|
|
|
+makes the code that contains the call run faster, significantly so if
|
|
|
+the inline function is small.
|
|
|
+
|
|
|
+Here's a function that uses @code{pair_second}:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+pairlist_length (struct list *l)
|
|
|
+@{
|
|
|
+ int length = 0;
|
|
|
+ while (l)
|
|
|
+ @{
|
|
|
+ length++;
|
|
|
+ l = pair_second (l);
|
|
|
+ @}
|
|
|
+ return length;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Substituting the code of @code{pair_second} into the definition of
|
|
|
+@code{pairlist_length} results in this code, in effect:
|
|
|
+
|
|
|
+@example
|
|
|
+int
|
|
|
+pairlist_length (struct list *l)
|
|
|
+@{
|
|
|
+ int length = 0;
|
|
|
+ while (l)
|
|
|
+ @{
|
|
|
+ length++;
|
|
|
+ l = l->second;
|
|
|
+ @}
|
|
|
+ return length;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Since the definition of @code{pair_second} does not say @code{extern}
|
|
|
+or @code{static}, that definition is used only for inlining. It
|
|
|
+doesn't generate code that can be called at run time. If not all the
|
|
|
+calls to the function are inlined, there must be a definition of the
|
|
|
+same function name in another module for them to call.
|
|
|
+
|
|
|
+@cindex inline functions, omission of
|
|
|
+@c @opindex fkeep-inline-functions
|
|
|
+Adding @code{static} to an inline function definition means the
|
|
|
+function definition is limited to this compilation module. Also, it
|
|
|
+generates run-time code if necessary for the sake of any calls that
|
|
|
+were not inlined. If all calls are inlined then the function
|
|
|
+definition does not generate run-time code, but you can force
|
|
|
+generation of run-time code with the option
|
|
|
+@option{-fkeep-inline-functions}.
|
|
|
+
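In practice, @code{static inline} is a common choice for small helper functions, since it works whether or not a given call is inlined. Here is a minimal sketch; the names @code{clamp_int} and @code{clamp_percent} are hypothetical.

```c
/* A small helper, private to this compilation module.  Calls may
   be inlined; if some call is not, the compiler generates
   run-time code for it here.  (Hypothetical example.)  */
static inline int
clamp_int (int value, int low, int high)
{
  if (value < low)
    return low;
  if (value > high)
    return high;
  return value;
}

/* An ordinary function that uses the inline helper.  */
int
clamp_percent (int value)
{
  return clamp_int (value, 0, 100);
}
```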
|
|
|
+@cindex extern inline function
|
|
|
+Specifying @code{extern} along with @code{inline} means the function is
|
|
|
+external and generates run-time code to be called from other
|
|
|
+separately compiled modules, as well as inlined. You can define the
|
|
|
+function as @code{inline} without @code{extern} in other modules so as
|
|
|
+to inline calls to the same function in those modules.
|
|
|
+
|
|
|
+Why are some calls not inlined? First of all, inlining is an
|
|
|
+optimization, so non-optimized compilation does not inline.
|
|
|
+
|
|
|
+Some calls cannot be inlined for technical reasons. Also, certain
|
|
|
+usages in a function definition can make it unsuitable for inline
|
|
|
+substitution. Among these usages are: variadic functions, use of
|
|
|
+@code{alloca}, use of computed goto (@pxref{Labels as Values}), and
|
|
|
+use of nonlocal goto. The option @option{-Winline} requests a warning
|
|
|
+when a function marked @code{inline} is unsuitable to be inlined. The
|
|
|
+warning explains what obstacle makes it unsuitable.
|
|
|
+
|
|
|
+Just because a call @emph{can} be inlined does not mean it
|
|
|
+@emph{should} be inlined. The GNU C compiler weighs costs and
|
|
|
+benefits to decide whether inlining a particular call is advantageous.
|
|
|
+
|
|
|
+You can force inlining of all calls to a given function that can be
|
|
|
+inlined, even in a non-optimized compilation, by specifying the
|
|
|
+@samp{always_inline} attribute for the function, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+/* @r{Prototype.} */
|
|
|
+inline void foo (const char) __attribute__((always_inline));
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This is a GNU C extension. @xref{Attributes}.
|
|
|
+
|
|
|
+A function call may be inlined even if not declared @code{inline} in
|
|
|
+special cases where the compiler can determine this is correct and
|
|
|
+desirable. For instance, when a static function is called only once,
|
|
|
+it will very likely be inlined. With @option{-flto}, link-time
|
|
|
+optimization, any function might be inlined. To absolutely prevent
|
|
|
+inlining of a specific function, specify
|
|
|
+@code{__attribute__((__noinline__))} in the function's definition.
|
|
|
+
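The two attributes can be combined in one module, as in this sketch. It assumes a compiler that accepts GNU C attributes, and the function names @code{square_int} and @code{sum_of_squares} are hypothetical.

```c
/* Request inlining of every call to square_int that can be
   inlined, even without optimization.  (Hypothetical example;
   relies on GNU C attributes.)  */
static inline int square_int (int x) __attribute__ ((always_inline));

static inline int
square_int (int x)
{
  return x * x;
}

/* Forbid inlining of calls to sum_of_squares itself.  */
__attribute__ ((__noinline__)) int
sum_of_squares (int a, int b)
{
  return square_int (a) + square_int (b);
}
```

The attributes affect only how the calls are compiled; the values computed are the same either way.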
|
|
|
+@node Obsolete Definitions
|
|
|
+@section Obsolete Function Features
|
|
|
+
|
|
|
+These features of function definitions are still used in old
|
|
|
+programs, but you shouldn't write code this way today.
|
|
|
+If you are just learning C, you can skip this section.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Old GNU Inlining:: An older inlining technique.
|
|
|
+* Old-Style Function Definitions:: Original K&R style functions.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Old GNU Inlining
|
|
|
+@subsection Older GNU C Inlining
|
|
|
+
|
|
|
+The GNU C spec for inline functions, before GCC version 5, defined
|
|
|
+@code{extern inline} on a function definition to mean to inline calls
|
|
|
+to it but @emph{not} generate code for the function that could be
|
|
|
+called at run time. By contrast, @code{inline} without @code{extern}
|
|
|
+specified to generate run-time code for the function. In effect, ISO
|
|
|
+incompatibly flipped the meanings of these two cases. We changed GCC
|
|
|
+in version 5 to adopt the ISO specification.
|
|
|
+
|
|
|
+Many programs still use these cases with the previous GNU C meanings.
|
|
|
+You can specify use of those meanings with the option
|
|
|
+@option{-fgnu89-inline}. You can also specify this for a single
|
|
|
+function with @code{__attribute__ ((gnu_inline))}. Here's an example:
|
|
|
+
|
|
|
+@example
|
|
|
+inline __attribute__ ((gnu_inline))
|
|
|
+void
|
|
|
+inc (int *a)
|
|
|
+@{
|
|
|
+ (*a)++;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@node Old-Style Function Definitions
|
|
|
+@subsection Old-Style Function Definitions
|
|
|
+@cindex old-style function definitions
|
|
|
+@cindex function definitions, old-style
|
|
|
+@cindex K&R-style function definitions
|
|
|
+
|
|
|
+The syntax of C traditionally allows omitting the data type in a
|
|
|
+function declaration if it specifies a storage class or a qualifier.
|
|
|
+Then the type defaults to @code{int}. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+static foo (double x);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+defaults the return type to @code{int}. This is bad practice; if you
|
|
|
+see it, fix it.
|
|
|
+
|
|
|
+An @dfn{old-style} (or ``K&R'') function definition is the way
|
|
|
+function definitions were written in the 1980s. It looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+@var{rettype}
|
|
|
+@var{function} (@var{parmnames})
|
|
|
+ @var{parm_declarations}
|
|
|
+@{
|
|
|
+ @var{body}
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+In @var{parmnames}, only the parameter names are listed, separated by
|
|
|
+commas. Then @var{parm_declarations} declares their data types; these
|
|
|
+declarations look just like variable declarations. If a parameter is
|
|
|
+listed in @var{parmnames} but has no declaration, it is implicitly
|
|
|
+declared @code{int}.
|
|
|
+
|
|
|
+There is no reason to write a definition this way nowadays, but such
|
|
|
+definitions can still be seen in older GNU programs.
|
|
|
+
|
|
|
+An old-style variadic function definition looks like this:
|
|
|
+
|
|
|
+@example
|
|
|
+#include <varargs.h>
|
|
|
+
|
|
|
+int
|
|
|
+add_multiple_values (va_alist)
|
|
|
+ va_dcl
|
|
|
+@{
|
|
|
+ int argcount;
|
|
|
+ int counter, total = 0;
|
|
|
+
|
|
|
+ /* @r{Declare a variable of type @code{va_list}.} */
|
|
|
+ va_list argptr;
|
|
|
+
|
|
|
+ /* @r{Initialize that variable.} */
|
|
|
+ va_start (argptr);
|
|
|
+
|
|
|
+ /* @r{Get the first argument (fixed).} */
|
|
|
+  argcount = va_arg (argptr, int);
|
|
|
+
|
|
|
+ for (counter = 0; counter < argcount; counter++)
|
|
|
+ @{
|
|
|
+ /* @r{Get the next additional argument.} */
|
|
|
+ total += va_arg (argptr, int);
|
|
|
+ @}
|
|
|
+
|
|
|
+ /* @r{End use of the @code{argptr} variable.} */
|
|
|
+ va_end (argptr);
|
|
|
+
|
|
|
+ return total;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Note that the old-style variadic function definition has no fixed
|
|
|
+parameter variables; all arguments must be obtained with
|
|
|
+@code{va_arg}.
|
|
|
+
|
|
|
+@node Compatible Types
|
|
|
+@chapter Compatible Types
|
|
|
+@cindex compatible types
|
|
|
+@cindex types, compatible
|
|
|
+
|
|
|
+Declaring a function or variable twice is valid in C only if the two
|
|
|
+declarations specify @dfn{compatible} types. In addition, some
|
|
|
+operations on pointers require operands to have compatible target
|
|
|
+types.
|
|
|
+
|
|
|
+In C, two different primitive types are never compatible. Likewise for
|
|
|
+the defined types @code{struct}, @code{union} and @code{enum}: two
|
|
|
+separately defined types are incompatible unless they are defined
|
|
|
+exactly the same way.
|
|
|
+
|
|
|
+However, there are a few cases where different types can be
|
|
|
+compatible:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+Every enumeration type is compatible with some integer type. In GNU
|
|
|
+C, the choice of integer type depends on the largest enumeration
|
|
|
+value.
|
|
|
+
|
|
|
+@c ??? Which one, in GCC?
|
|
|
+@c ??? ... it varies, depending on the enum values. Testing on
|
|
|
+@c ??? fencepost, it appears to use a 4-byte signed integer first,
|
|
|
+@c ??? then moves on to an 8-byte signed integer. These details
|
|
|
+@c ??? might be platform-dependent, as the C standard says that even
|
|
|
+@c ??? char could be used as an enum type, but it's at least true
|
|
|
+@c ??? that GCC chooses a type that is at least large enough to
|
|
|
+@c ??? hold the largest enum value.
|
|
|
+
|
|
|
+@item
|
|
|
+Array types are compatible if the element types are compatible
|
|
|
+and the sizes (when specified) match.
|
|
|
+
|
|
|
+@item
|
|
|
+Pointer types are compatible if the pointer target types are
|
|
|
+compatible.
|
|
|
+
|
|
|
+@item
|
|
|
+Function types that specify argument types are compatible if the
|
|
|
+return types are compatible and the argument types are compatible,
|
|
|
+argument by argument. In addition, they must all agree in whether
|
|
|
+they use @code{...} to allow additional arguments.
|
|
|
+
|
|
|
+@item
|
|
|
+Function types that don't specify argument types are compatible if the
|
|
|
+return types are.
|
|
|
+
|
|
|
+@item
|
|
|
+Function types that specify the argument types are compatible with
|
|
|
+function types that omit them, if the return types are compatible and
|
|
|
+the specified argument types are unaltered by the argument promotions
|
|
|
+(@pxref{Argument Promotions}).
|
|
|
+@end itemize
|
|
|
+
|
|
|
+In order for types to be compatible, they must agree in their type
|
|
|
+qualifiers. Thus, @code{const int} and @code{int} are incompatible.
|
|
|
+It follows that @code{const int *} and @code{int *} are incompatible
|
|
|
+too (they are pointers to types that are not compatible).
|
|
|
+
|
|
|
+If two types are compatible ignoring the qualifiers, we call them
|
|
|
+@dfn{nearly compatible}. (If they are array types, we ignore
|
|
|
+qualifiers on the element types.@footnote{This is a GNU C extension.})
|
|
|
+Comparison of pointers is valid if the pointers' target types are
|
|
|
+nearly compatible. Likewise, the two branches of a conditional
|
|
|
+expression may be pointers to nearly compatible target types.
|
|
|
+
|
|
|
+If two types are compatible ignoring the qualifiers, and the first
|
|
|
+type has all the qualifiers of the second type, we say the first is
|
|
|
+@dfn{upward compatible} with the second. Assignment of pointers
|
|
|
+requires the assigned pointer's target type to be upward compatible
|
|
|
+with the target type of the right operand (the new value).
|
|
|
+
|
|
|
+@node Type Conversions
|
|
|
+@chapter Type Conversions
|
|
|
+@cindex type conversions
|
|
|
+@cindex conversions, type
|
|
|
+
|
|
|
+C converts between data types automatically when that seems clearly
|
|
|
+necessary. In addition, you can convert explicitly with a @dfn{cast}.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Explicit Type Conversion:: Casting a value from one type to another.
|
|
|
+* Assignment Type Conversions:: Automatic conversion by assignment operation.
|
|
|
+* Argument Promotions:: Automatic conversion of function parameters.
|
|
|
+* Operand Promotions:: Automatic conversion of arithmetic operands.
|
|
|
+* Common Type:: When operand types differ, which one is used?
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Explicit Type Conversion
|
|
|
+@section Explicit Type Conversion
|
|
|
+@cindex cast
|
|
|
+@cindex explicit type conversion
|
|
|
+
|
|
|
+You can do explicit conversions using the unary @dfn{cast} operator,
|
|
|
+which is written as a type designator (@pxref{Type Designators}) in
|
|
|
+parentheses. For example, @code{(int)} is the operator to cast to
|
|
|
+type @code{int}. Here's an example of using it:
|
|
|
+
|
|
|
+@example
|
|
|
+@{
|
|
|
+ double d = 5.5;
|
|
|
+
|
|
|
+ printf ("Floating point value: %f\n", d);
|
|
|
+ printf ("Rounded to integer: %d\n", (int) d);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+Using @code{(int) d} passes an @code{int} value as argument to
|
|
|
+@code{printf}, so you can print it with @samp{%d}. Using just
|
|
|
+@code{d} without the cast would pass the value as @code{double}.
|
|
|
+That won't work at all with @samp{%d}; the results would be gibberish.
|
|
|
+
|
|
|
+To divide one integer by another without rounding,
|
|
|
+cast either of the integers to @code{double} first:
|
|
|
+
|
|
|
+@example
|
|
|
+(double) @var{dividend} / @var{divisor}
|
|
|
+@var{dividend} / (double) @var{divisor}
|
|
|
+@end example
|
|
|
+
|
|
|
+It is enough to cast one of them, because that forces the common type
|
|
|
+to @code{double} so the other will be converted automatically.
|
|
|
+
|
|
|
+The valid cast conversions are:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+One numerical type to another.
|
|
|
+
|
|
|
+@item
|
|
|
+One pointer type to another.
|
|
|
+(Converting between pointers that point to functions
|
|
|
+and pointers that point to data is not standard C.)
|
|
|
+
|
|
|
+@item
|
|
|
+A pointer type to an integer type.
|
|
|
+
|
|
|
+@item
|
|
|
+An integer type to a pointer type.
|
|
|
+
|
|
|
+@item
|
|
|
+To a union type, from the type of any alternative in the union
|
|
|
+(@pxref{Unions}). (This is a GNU extension.)
|
|
|
+
|
|
|
+@item
|
|
|
+Anything, to @code{void}.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+@node Assignment Type Conversions
|
|
|
+@section Assignment Type Conversions
|
|
|
+@cindex assignment type conversions
|
|
|
+
|
|
|
+Certain type conversions occur automatically in assignments
|
|
|
+and certain other contexts. These are the conversions
|
|
|
+assignments can do:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+Converting any numeric type to any other numeric type.
|
|
|
+
|
|
|
+@item
|
|
|
+Converting @code{void *} to any other pointer type
|
|
|
+(except pointer-to-function types).
|
|
|
+
|
|
|
+@item
|
|
|
+Converting any other pointer type to @code{void *}
|
|
|
+(except pointer-to-function types).
|
|
|
+
|
|
|
+@item
|
|
|
+Converting 0 (a null pointer constant) to any pointer type.
|
|
|
+
|
|
|
+@item
|
|
|
+Converting any pointer type to @code{bool}. (The result is
|
|
|
+1 if the pointer is not null.)
|
|
|
+
|
|
|
+@item
|
|
|
+Converting between pointer types when the left-hand target type is
|
|
|
+upward compatible with the right-hand target type. @xref{Compatible
|
|
|
+Types}.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+These type conversions occur automatically in certain contexts,
|
|
|
+which are:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+An assignment converts the type of the right-hand expression
|
|
|
+to the type wanted by the left-hand expression. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+double i;
|
|
|
+i = 5;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+converts 5 to @code{double}.
|
|
|
+
|
|
|
+@item
|
|
|
+A function call, when the function specifies the type for that
|
|
|
+argument, converts the argument value to that type. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+void foo (double);
|
|
|
+foo (5);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+converts 5 to @code{double}.
|
|
|
+
|
|
|
+@item
|
|
|
+A @code{return} statement converts the specified value to the type
|
|
|
+that the function is declared to return. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+double
|
|
|
+foo ()
|
|
|
+@{
|
|
|
+ return 5;
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+also converts 5 to @code{double}.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+In all three contexts, if the conversion is impossible, that
|
|
|
+constitutes an error.
|
|
|
+
|
|
|
+@node Argument Promotions
|
|
|
+@section Argument Promotions
|
|
|
+@cindex argument promotions
|
|
|
+@cindex promotion of arguments
|
|
|
+
|
|
|
+When a function's definition or declaration does not specify the type
|
|
|
+of an argument, that argument is passed without conversion in whatever
|
|
|
+type it has, with these exceptions:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+Some narrow numeric values are @dfn{promoted} to a wider type. If the
|
|
|
+expression is a narrow integer, such as @code{char} or @code{short},
|
|
|
+the call converts it automatically to @code{int} (@pxref{Integer
|
|
|
+Types}).@footnote{On an embedded controller where @code{char}
|
|
|
+or @code{short} is the same width as @code{int}, @code{unsigned char}
|
|
|
+or @code{unsigned short} promotes to @code{unsigned int}, but that
|
|
|
+never occurs in GNU C on real computers.}
|
|
|
+
|
|
|
+In this example, the expression @code{c} is passed as an @code{int}:
|
|
|
+
|
|
|
+@example
|
|
|
+char c = '$';
|
|
|
+
|
|
|
+printf ("Character c is '%c'\n", c);
|
|
|
+@end example
|
|
|
+
|
|
|
+@item
|
|
|
+If the expression
|
|
|
+has type @code{float}, the call converts it automatically to
|
|
|
+@code{double}.
|
|
|
+
|
|
|
+@item
|
|
|
+An array as argument is converted to a pointer to its zeroth element.
|
|
|
+
|
|
|
+@item
|
|
|
+A function name as argument is converted to a pointer to that function.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+@node Operand Promotions
|
|
|
+@section Operand Promotions
|
|
|
+@cindex operand promotions
|
|
|
+
|
|
|
+The operands in arithmetic operations undergo type conversion automatically.
|
|
|
+These @dfn{operand promotions} are the same as the argument promotions
|
|
|
+except without converting @code{float} to @code{double}. In other words,
|
|
|
+the operand promotions convert
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+@code{char} or @code{short} (whether signed or not) to @code{int},
|
|
|
+
|
|
|
+@item
|
|
|
+an array to a pointer to its zeroth element, and
|
|
|
+
|
|
|
+@item
|
|
|
+a function name to a pointer to that function.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+@node Common Type
|
|
|
+@section Common Type
|
|
|
+@cindex common type
|
|
|
+
|
|
|
+Arithmetic binary operators (except the shift operators) convert their
|
|
|
+operands to the @dfn{common type} before operating on them.
|
|
|
+Conditional expressions also convert the two possible results to their
|
|
|
+common type. Here are the rules for determining the common type.
|
|
|
+
|
|
|
+If one of the numbers has a floating-point type and the other is an
|
|
|
+integer, the common type is that floating-point type. For instance,
|
|
|
+
|
|
|
+@example
|
|
|
+5.6 * 2 @result{} 11.2 /* @r{a @code{double} value} */
|
|
|
+@end example
|
|
|
+
|
|
|
+If both are floating point, the type with the larger range is the
|
|
|
+common type.
|
|
|
+
|
|
|
+If both are integers but of different widths, the common type
|
|
|
+is the wider of the two.
|
|
|
+
|
|
|
+If they are integer types of the same width, the common type is
|
|
|
+unsigned if either operand is unsigned, and it's @code{long} if either
|
|
|
+operand is @code{long}. It's @code{long long} if either operand is
|
|
|
+@code{long long}.
|
|
|
+
|
|
|
+These rules apply to addition, subtraction, multiplication, division,
|
|
|
+remainder, comparisons, and bitwise operations. They also apply to
|
|
|
+the two branches of a conditional expression, and to the arithmetic
|
|
|
+done in a modifying assignment operation.
|
|
|
+
|
|
|
+@node Scope
|
|
|
+@chapter Scope
|
|
|
+@cindex scope
|
|
|
+@cindex block scope
|
|
|
+@cindex function scope
|
|
|
+@cindex function prototype scope
|
|
|
+
|
|
|
+Each definition or declaration of an identifier is visible
|
|
|
+in certain parts of the program, which are typically less than the whole
|
|
|
+of the program. The parts where it is visible are called its @dfn{scope}.
|
|
|
+
|
|
|
+Normally, declarations made at the top-level in the source -- that is,
|
|
|
+not within any blocks or function definitions -- are visible for the
|
|
|
+entire contents of the source file after that point. This is called
|
|
|
+@dfn{file scope} (@pxref{File-Scope Variables}).
|
|
|
+
|
|
|
+Declarations made within blocks of code, including within function
|
|
|
+definitions, are visible only within those blocks. This is called
|
|
|
+@dfn{block scope}. Here is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+void
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ int x = 42;
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+In this example, the variable @code{x} has block scope; it is visible
|
|
|
+only within the @code{foo} function definition block. Thus, other
|
|
|
+blocks could have their own variables, also named @code{x}, without
|
|
|
+any conflict between those variables.
|
|
|
+
|
|
|
+A variable declared inside a subblock has a scope limited to
|
|
|
+that subblock:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+void
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ @{
|
|
|
+ int x = 42;
|
|
|
+ @}
|
|
|
+ // @r{@code{x} is out of scope here.}
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+If a variable declared within a block has the same name as a variable
|
|
|
+declared outside of that block, the definition within the block
|
|
|
+takes precedence during its scope:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int x = 42;
|
|
|
+
|
|
|
+void
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ int x = 17;
|
|
|
+ printf ("%d\n", x);
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This prints 17, the value of the variable @code{x} declared in the
|
|
|
+function body block, rather than the value of the variable @code{x} at
|
|
|
+file scope. We say that the inner declaration of @code{x}
|
|
|
+@dfn{shadows} the outer declaration, for the extent of the inner
|
|
|
+declaration's scope.
|
|
|
+
|
|
|
+A declaration with block scope can be shadowed by another declaration
|
|
|
+with the same name in a subblock.
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+void
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ char *x = "foo";
|
|
|
+ @{
|
|
|
+ int x = 42;
|
|
|
+ @r{@dots{}}
|
|
|
+ exit (x / 6);
|
|
|
+ @}
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+A function parameter's scope is the entire function body, but it can
|
|
|
+be shadowed. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int x = 42;
|
|
|
+
|
|
|
+void
|
|
|
+foo (int x)
|
|
|
+@{
|
|
|
+ printf ("%d\n", x);
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This prints the value of @code{x} the function parameter, rather than
|
|
|
+the value of the file-scope variable @code{x}.
|
|
|
+
|
|
|
+Labels (@pxref{goto Statement}) have @dfn{function} scope: each label
|
|
|
+is visible for the whole of the containing function body, both before
|
|
|
+and after the label declaration:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+void
|
|
|
+foo (void)
|
|
|
+@{
|
|
|
+ @r{@dots{}}
|
|
|
+ goto bar;
|
|
|
+ @r{@dots{}}
|
|
|
+ @{ // @r{Subblock does not affect labels.}
|
|
|
+ bar:
|
|
|
+ @r{@dots{}}
|
|
|
+ @}
|
|
|
+ goto bar;
|
|
|
+@}
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+Except for labels, a declared identifier is not
|
|
|
+visible to code before its declaration. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int x = 5;
|
|
|
+int y = x + 10;
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+will work, but:
|
|
|
+
|
|
|
+@example
|
|
|
+@group
|
|
|
+int x = y + 10;
|
|
|
+int y = 5;
|
|
|
+@end group
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+cannot refer to the variable @code{y} before its declaration.
|
|
|
+
|
|
|
+@include cpp.texi
|
|
|
+
|
|
|
+@node Integers in Depth
|
|
|
+@chapter Integers in Depth
|
|
|
+
|
|
|
+This chapter explains the machine-level details of integer types: how
|
|
|
+they are represented as bits in memory, and the range of possible
|
|
|
+values for each integer type.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Integer Representations:: How integer values appear in memory.
|
|
|
+* Maximum and Minimum Values:: Value ranges of integer types.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Integer Representations
|
|
|
+@section Integer Representations
|
|
|
+
|
|
|
+@cindex integer representations
|
|
|
+@cindex representation of integers
|
|
|
+
|
|
|
+Modern computers store integer values as binary (base-2) numbers that
|
|
|
+occupy a single unit of storage, typically either as an 8-bit
|
|
|
+@code{char}, a 16-bit @code{short int}, a 32-bit @code{int}, or
|
|
|
+possibly, a 64-bit @code{long long int}. Whether a @code{long int} is
|
|
|
+a 32-bit or a 64-bit value is system dependent.@footnote{In theory,
|
|
|
+any of these types could have some other size, but it's not worth even
|
|
|
+a minute to cater to that possibility. It never happens on
|
|
|
+GNU/Linux.}
|
|
|
+
|
|
|
+@cindex @code{CHAR_BIT}
|
|
|
+The macro @code{CHAR_BIT}, defined in @file{limits.h}, gives the number
|
|
|
+of bits in type @code{char}. On any real operating system, the value
|
|
|
+is 8.
|
|
|
+
|
|
|
+The fixed sizes of numeric types necessarily limit their @dfn{range
|
|
|
+of values}, and the particular encoding of integers decides what that
|
|
|
+range is.
|
|
|
+
|
|
|
+@cindex two's-complement representation
|
|
|
+For unsigned integers, the entire space is used to represent a
|
|
|
+nonnegative value. Signed integers are stored using
|
|
|
+@dfn{two's-complement representation}: a signed integer with @var{n}
|
|
|
+bits has a range from @math{-2@sup{(@var{n} - 1)}} to @minus{}1 to 0
|
|
|
+to 1 to @math{+2@sup{(@var{n} - 1)} - 1}, inclusive. The leftmost, or
|
|
|
+high-order, bit is called the @dfn{sign bit}.
|
|
|
+
|
|
|
+@c ??? Needs correcting
|
|
|
+
|
|
|
+There is only one value that means zero, and the most negative number
|
|
|
+lacks a positive counterpart. As a result, negating that number
|
|
|
+causes overflow; in practice, its result is that number back again.
|
|
|
+For example, a two's-complement signed 8-bit integer can represent all
|
|
|
+decimal numbers from @minus{}128 to +127. We will revisit that
|
|
|
+peculiarity shortly.
|
|
|
+
|
|
|
+Decades ago, there were computers that didn't use two's-complement
|
|
|
+representation for integers (@pxref{Integers in Depth}), but they are
|
|
|
+long gone and not worth any effort to support.
|
|
|
+
|
|
|
+@c ??? Is this duplicate?
|
|
|
+
|
|
|
+When an arithmetic operation produces a value that is too big to
|
|
|
+represent, the operation is said to @dfn{overflow}. In C, integer
|
|
|
+overflow does not interrupt the control flow or signal an error.
|
|
|
+What it does depends on signedness.
|
|
|
+
|
|
|
+For unsigned arithmetic, the result of an operation that overflows is
|
|
|
+the @var{n} low-order bits of the correct value. If the correct value
|
|
|
+is representable in @var{n} bits, that is always the result;
|
|
|
+thus we often say that ``integer arithmetic is exact,'' omitting the
|
|
|
+crucial qualifying phrase ``as long as the exact result is
|
|
|
+representable.''
|
|
|
+
|
|
|
+In principle, a C program should be written so that overflow never
|
|
|
+occurs for signed integers, but in GNU C you can specify various ways
|
|
|
+of handling such overflow (@pxref{Integer Overflow}).
|
|
|
+
|
|
|
+Integer representations are best understood by looking at a table for
|
|
|
+a tiny integer size; here are the possible values for an integer with
|
|
|
+three bits:
|
|
|
+
|
|
|
+@multitable @columnfractions .25 .25 .25 .25
|
|
|
+@headitem Unsigned @tab Signed @tab Bits @tab Two's Complement
|
|
|
+@item 0 @tab 0 @tab 000 @tab 000 (0)
|
|
|
+@item 1 @tab 1 @tab 001 @tab 111 (-1)
|
|
|
+@item 2 @tab 2 @tab 010 @tab 110 (-2)
|
|
|
+@item 3 @tab 3 @tab 011 @tab 101 (-3)
|
|
|
+@item 4 @tab -4 @tab 100 @tab 100 (-4)
|
|
|
+@item 5 @tab -3 @tab 101 @tab 011 (3)
|
|
|
+@item 6 @tab -2 @tab 110 @tab 010 (2)
|
|
|
+@item 7 @tab -1 @tab 111 @tab 001 (1)
|
|
|
+@end multitable
|
|
|
+
|
|
|
+The parenthesized decimal numbers in the last column represent the
|
|
|
+signed meanings of the two's-complement of the line's value. Recall
|
|
|
+that, in two's-complement encoding, the high-order bit is 0 when
|
|
|
+the number is nonnegative.
|
|
|
+
|
|
|
+We can now understand the peculiar behavior of negation of the
|
|
|
+most negative two's-complement integer: start with 0b100,
|
|
|
+invert the bits to get 0b011, and add 1: we get
|
|
|
+0b100, the value we started with.
|
|
|
+
|
|
|
+We can also see overflow behavior in two's-complement:
|
|
|
+
|
|
|
+@example
|
|
|
+3 + 1 = 0b011 + 0b001 = 0b100 = (-4)
|
|
|
+3 + 2 = 0b011 + 0b010 = 0b101 = (-3)
|
|
|
+3 + 3 = 0b011 + 0b011 = 0b110 = (-2)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+A sum of two nonnegative signed values that overflows has a 1 in the
|
|
|
+sign bit, so the exact positive result is truncated to a negative
|
|
|
+value.
|
|
|
+
|
|
|
+@c =====================================================================
|
|
|
+
|
|
|
+@node Maximum and Minimum Values
|
|
|
+@section Maximum and Minimum Values
|
|
|
+@cindex maximum integer values
|
|
|
+@cindex minimum integer values
|
|
|
+@cindex integer ranges
|
|
|
+@cindex ranges of integer types
|
|
|
+@findex INT_MAX
|
|
|
+@findex UINT_MAX
|
|
|
+@findex SHRT_MAX
|
|
|
+@findex LONG_MAX
|
|
|
+@findex LLONG_MAX
|
|
|
+@findex USHRT_MAX
|
|
|
+@findex ULONG_MAX
|
|
|
+@findex ULLONG_MAX
|
|
|
+@findex CHAR_MAX
|
|
|
+@findex SCHAR_MAX
|
|
|
+@findex UCHAR_MAX
|
|
|
+
|
|
|
+For each primitive integer type, there is a standard macro defined in
|
|
|
+@file{limits.h} that gives the largest value that type can hold. For
|
|
|
+instance, for type @code{int}, the maximum value is @code{INT_MAX}.
|
|
|
+On a 32-bit computer, that is equal to 2,147,483,647. The
|
|
|
+maximum value for @code{unsigned int} is @code{UINT_MAX}, which on a
|
|
|
+32-bit computer is equal to 4,294,967,295. Likewise, there are
|
|
|
+@code{SHRT_MAX}, @code{LONG_MAX}, and @code{LLONG_MAX}, and
|
|
|
+corresponding unsigned limits @code{USHRT_MAX}, @code{ULONG_MAX}, and
|
|
|
+@code{ULLONG_MAX}.
|
|
|
+
|
|
|
+Since there are three ways to specify a @code{char} type, there are
|
|
|
+also three limits: @code{CHAR_MAX}, @code{SCHAR_MAX}, and
|
|
|
+@code{UCHAR_MAX}.
|
|
|
+
|
|
|
+For each type that is or might be signed, there is another symbol that
|
|
|
+gives the minimum value it can hold. (Just replace @code{MAX} with
|
|
|
+@code{MIN} in the names listed above.) There is no minimum limit
|
|
|
+symbol for types specified with @code{unsigned} because the
|
|
|
+minimum for them is universally zero.
|
|
|
+
|
|
|
+@code{INT_MIN} is not the negative of @code{INT_MAX}. In
|
|
|
+two's-complement representation, the most negative number is 1 less
|
|
|
+than the negative of the most positive number. Thus, @code{INT_MIN}
|
|
|
+on a 32-bit computer has the value @minus{}2,147,483,648. You can't
|
|
|
+actually write the value that way in C, since it would overflow.
|
|
|
+That's a good reason to use @code{INT_MIN} to specify
|
|
|
+that value. Its definition is written to avoid overflow.
|
|
|
+
|
|
|
+@include fp.texi
|
|
|
+
|
|
|
+@node Compilation
|
|
|
+@chapter Compilation
|
|
|
+@cindex object file
|
|
|
+@cindex compilation module
|
|
|
+@cindex make rules
|
|
|
+
|
|
|
+Early in the manual we explained how to compile a simple C program
|
|
|
+that consists of a single source file (@pxref{Compile Example}).
|
|
|
+However, we handle only short programs that way. A typical C program
|
|
|
+consists of many source files, each of which is a separate
|
|
|
+@dfn{compilation module}---meaning that it has to be compiled
|
|
|
+separately.
|
|
|
+
|
|
|
+The full details of how to compile with GCC are documented in xxxx.
|
|
|
+@c ??? ref
|
|
|
+Here we give only a simple introduction.
|
|
|
+
|
|
|
+These are the commands to compile two compilation modules,
|
|
|
+@file{foo.c} and @file{bar.c}, with a command for each module:
|
|
|
+
|
|
|
+@example
|
|
|
+gcc -c -O -g foo.c
|
|
|
+gcc -c -O -g bar.c
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+In these commands, @option{-g} says to generate debugging information,
|
|
|
+@option{-O} says to do some optimization, and @option{-c} says to put
|
|
|
+the compiled code for that module into a corresponding @dfn{object
|
|
|
+file} and go no further. The object file for @file{foo.c} is called
|
|
|
+@file{foo.o}, and so on.
|
|
|
+
|
|
|
+If you wish, you can specify the additional options @option{-Wformat
|
|
|
+-Wparentheses -Wstrict-prototypes}, which request additional warnings.
|
|
|
+
|
|
|
+One reason to divide a large program into multiple compilation modules
|
|
|
+is to control how each module can access the internals of the others.
|
|
|
+When a module declares a function or variable @code{extern}, other
|
|
|
+modules can access it. The other functions and variables in
|
|
|
+a module can't be accessed from outside that module.
|
|
|
+
|
|
|
+The other reason for using multiple modules is so that changing
|
|
|
+one source file does not require recompiling all of them in order
|
|
|
+to try the modified program. Dividing a large program into many
|
|
|
+substantial modules in this way typically makes recompilation much faster.
|
|
|
+
|
|
|
+@cindex linking object files
|
|
|
+After you compile all the program's modules, in order to run the
|
|
|
+program you must @dfn{link} the object files into a combined
|
|
|
+executable, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+gcc -o foo foo.o bar.o
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+In this command, @option{-o foo} specifies the file name for the
|
|
|
+executable file, and the other arguments are the object files to link.
|
|
|
+Always specify the executable file name in a command that generates
|
|
|
+one.
|
|
|
+
|
|
|
+Normally we don't run any of these commands directly. Instead we
|
|
|
+write a set of @dfn{make rules} for the program, then use the
|
|
|
+@command{make} program to recompile only the source files that need to
|
|
|
+be recompiled.
|
|
|
+
|
|
|
+@c ??? ref to make manual
|
|
|
+
|
|
|
+@node Directing Compilation
|
|
|
+@chapter Directing Compilation
|
|
|
+
|
|
|
+This chapter describes C constructs that don't alter the program's
|
|
|
+meaning @emph{as such}, but rather direct the compiler how to treat
|
|
|
+some aspects of the program.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Pragmas:: Controlling compilation of some constructs.
|
|
|
+* Static Assertions:: Compile-time tests for conditions.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Pragmas
|
|
|
+@section Pragmas
|
|
|
+
|
|
|
+A @dfn{pragma} is an annotation in a program that gives direction to
|
|
|
+the compiler.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Pragma Basics:: Pragma syntax and usage.
|
|
|
+* Severity Pragmas:: Settings for compile-time pragma output.
|
|
|
+* Optimization Pragmas:: Controlling optimizations.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@c See also @ref{Macro Pragmas}, which save and restore macro definitions.
|
|
|
+
|
|
|
+@node Pragma Basics
|
|
|
+@subsection Pragma Basics
|
|
|
+
|
|
|
+C defines two syntactical forms for pragmas, the line form and the
|
|
|
+token form. You can write any pragma in either form, with the same
|
|
|
+meaning.
|
|
|
+
|
|
|
+The line form is a line in the source code, like this:
|
|
|
+
|
|
|
+@example
|
|
|
+#pragma @var{line}
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The line pragma has no effect on the parsing of the lines around it.
|
|
|
+This form has the drawback that it can't be generated by a macro expansion.
|
|
|
+
|
|
|
+The token form is a series of tokens; it can appear anywhere in the
|
|
|
+program between the other tokens.
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma (@var{stringconstant})
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The pragma has no effect on the syntax of the tokens that surround it;
|
|
|
+thus, here's a pragma in the middle of an @code{if} statement:
|
|
|
+
|
|
|
+@example
|
|
|
+if _Pragma ("hello") (x > 1)
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+However, that's an unclear thing to do; for the sake of
|
|
|
+understandability, it is better to put a pragma on a line by itself
|
|
|
+and not embedded in the middle of another construct.
|
|
|
+
|
|
|
+Both forms of pragma have a textual argument. In a line pragma, the
|
|
|
+text is the rest of the line. The textual argument to @code{_Pragma}
|
|
|
+uses the same syntax as a C string constant: surround the text with
|
|
|
+two @samp{"} characters, and add a backslash before each @samp{"} or
|
|
|
+@samp{\} character in it.
|
|
|
+
|
|
|
+With either syntax, the textual argument specifies what to do.
|
|
|
+It begins with one or several words that specify the operation.
|
|
|
+If the compiler does not recognize them, it ignores the pragma.
|
|
|
+
|
|
|
+Here are the pragma operations supported in GNU C@.
|
|
|
+
|
|
|
+@c ??? Verify font for []
|
|
|
+@table @code
|
|
|
+@item #pragma GCC dependency "@var{file}" [@var{message}]
|
|
|
+@itemx _Pragma ("GCC dependency \"@var{file}\" [@var{message}]")
|
|
|
+Declares that the current source file depends on @var{file}, so GNU C
|
|
|
+compares the file times and gives a warning if @var{file} is newer
|
|
|
+than the current source file.
|
|
|
+
|
|
|
+This directive searches for @var{file} the way @code{#include}
|
|
|
+searches for a non-system header file.
|
|
|
+
|
|
|
+If @var{message} is given, the warning message includes that text.
|
|
|
+
|
|
|
+Examples:
|
|
|
+
|
|
|
+@example
|
|
|
+#pragma GCC dependency "parse.y"
|
|
|
+_Pragma ("GCC dependency \"/usr/include/time.h\" \
|
|
|
+rerun fixincludes")
|
|
|
+@end example
|
|
|
+
|
|
|
+@item #pragma GCC poison @var{identifiers}
|
|
|
+@itemx _Pragma ("GCC poison @var{identifiers}")
|
|
|
+Poisons the identifiers listed in @var{identifiers}.
|
|
|
+
|
|
|
+This is useful to make sure all mention of @var{identifiers} has been
|
|
|
+deleted from the program and that no reference to them creeps back in.
|
|
|
+If any of those identifiers appears anywhere in the source after the
|
|
|
+directive, it causes a compilation error. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+#pragma GCC poison printf sprintf fprintf
|
|
|
+sprintf(some_string, "hello");
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+generates an error.
|
|
|
+
|
|
|
+If a poisoned identifier appears as part of the expansion of a macro
|
|
|
+that was defined before the identifier was poisoned, it will @emph{not}
|
|
|
+cause an error. Thus, system headers that define macros that use
|
|
|
+the identifier will not cause errors.
|
|
|
+
|
|
|
+For example,
|
|
|
+
|
|
|
+@example
|
|
|
+#define strrchr rindex
|
|
|
+_Pragma ("GCC poison rindex")
|
|
|
+strrchr(some_string, 'h');
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+does not cause a compilation error.
|
|
|
+
|
|
|
+@item #pragma GCC system_header
|
|
|
+@itemx _Pragma ("GCC system_header")
|
|
|
+Specify treating the rest of the current source file as if it came
|
|
|
+from a system header file. @xref{System Headers, System Headers,
|
|
|
+System Headers, gcc, Using the GNU Compiler Collection}.
|
|
|
+
|
|
|
+@item #pragma GCC warning @var{message}
|
|
|
+@itemx _Pragma ("GCC warning @var{message}")
|
|
|
+Equivalent to @code{#warning}. Its advantage is that the
|
|
|
+@code{_Pragma} form can be included in a macro definition.
|
|
|
+
|
|
|
+@item #pragma GCC error @var{message}
|
|
|
+@itemx _Pragma ("GCC error @var{message}")
|
|
|
+Equivalent to @code{#error}. Its advantage is that the
|
|
|
+@code{_Pragma} form can be included in a macro definition.
|
|
|
+
|
|
|
+@item #pragma message @var{message}
|
|
|
+@itemx _Pragma ("message @var{message}")
|
|
|
+Similar to @samp{GCC warning} and @samp{GCC error}, this simply prints an
|
|
|
+informational message, and could be used to include additional warning
|
|
|
+or error text without triggering more warnings or errors. (Note that
|
|
|
+unlike @samp{warning} and @samp{error}, @samp{message} does not include
|
|
|
+@samp{GCC} as part of the pragma.)
|
|
|
+@end table
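+
+As a rough sketch of the macro exception described under @samp{GCC
+poison} above, the following fragment defines a macro before poisoning
+the identifier it uses.  The names @code{find_last} and
+@code{last_slash} are hypothetical, chosen only for illustration.

```c
#include <string.h>

/* Defined before the poison directive, so its expansion may still
   mention strrchr.  */
#define find_last(s, c) strrchr ((s), (c))

#pragma GCC poison strrchr

/* From here on, writing strrchr directly would be a compilation
   error, but going through the earlier macro is accepted.  */
const char *
last_slash (const char *path)
{
  return find_last (path, '/');
}
```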
|
|
|
+
|
|
|
+@node Severity Pragmas
|
|
|
+@subsection Severity Pragmas
|
|
|
+
|
|
|
+These pragmas control the severity of classes of diagnostics.
|
|
|
+You can specify the class of diagnostic with the GCC option that causes
|
|
|
+those diagnostics to be generated.
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item #pragma GCC diagnostic error @var{option}
|
|
|
+@itemx _Pragma ("GCC diagnostic error @var{option}")
|
|
|
+For code following this pragma, treat diagnostics of the variety
|
|
|
+specified by @var{option} as errors. For example:
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma ("GCC diagnostic error \"-Wformat\"")
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+specifies to treat diagnostics enabled by the @option{-Wformat} option
|
|
|
+as errors rather than warnings.
|
|
|
+
|
|
|
+@item #pragma GCC diagnostic warning @var{option}
|
|
|
+@itemx _Pragma ("GCC diagnostic warning @var{option}")
|
|
|
+For code following this pragma, treat diagnostics of the variety
|
|
|
+specified by @var{option} as warnings. This overrides the
|
|
|
+@option{-Werror} option, which says to treat warnings as errors.
|
|
|
+
|
|
|
+@item #pragma GCC diagnostic ignored @var{option}
|
|
|
+@itemx _Pragma ("GCC diagnostic ignored @var{option}")
|
|
|
+For code following this pragma, refrain from reporting any diagnostics
|
|
|
+of the variety specified by @var{option}.
|
|
|
+
|
|
|
+@item #pragma GCC diagnostic push
|
|
|
+@itemx _Pragma ("GCC diagnostic push")
|
|
|
+@itemx #pragma GCC diagnostic pop
|
|
|
+@itemx _Pragma ("GCC diagnostic pop")
|
|
|
+These pragmas maintain a stack of states for severity settings.
|
|
|
+@samp{GCC diagnostic push} saves the current settings on the stack,
|
|
|
+and @samp{GCC diagnostic pop} pops the last stack item and restores
|
|
|
+the current settings from that.
|
|
|
+
|
|
|
+@samp{GCC diagnostic pop} when the severity setting stack is empty
|
|
|
+restores the settings to what they were at the start of compilation.
|
|
|
+
|
|
|
+Here is an example:
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma ("GCC diagnostic error \"-Wformat\"")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages treated as errors. } */
|
|
|
+
|
|
|
+_Pragma ("GCC diagnostic push")
|
|
|
+_Pragma ("GCC diagnostic warning \"-Wformat\"")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages treated as warnings. } */
|
|
|
+
|
|
|
+_Pragma ("GCC diagnostic push")
|
|
|
+_Pragma ("GCC diagnostic ignored \"-Wformat\"")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages suppressed. } */
|
|
|
+
|
|
|
+_Pragma ("GCC diagnostic pop")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages treated as warnings again. } */
|
|
|
+
|
|
|
+_Pragma ("GCC diagnostic pop")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages treated as errors again. } */
|
|
|
+
|
|
|
+/* @r{This is an excess @samp{pop} that matches no @samp{push}. } */
|
|
|
+_Pragma ("GCC diagnostic pop")
|
|
|
+
|
|
|
+/* @r{@option{-Wformat} messages treated once again}
|
|
|
+ @r{as specified by the GCC command-line options.} */
|
|
|
+@end example
|
|
|
+@end table
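+
+A common use of this stack is to silence one warning around a small
+region and then restore the previous state.  The following is only a
+sketch: it assumes the option is written as a quoted string, which is
+how the GCC manual shows it, and the function name is hypothetical.

```c
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"

int
answer_with_scratch (void)
{
  int scratch;   /* Unused on purpose; not reported inside this region.  */
  return 42;
}

#pragma GCC diagnostic pop
/* -Wunused-variable is back to whatever it was before the push.  */
```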
|
|
|
+
|
|
|
+@node Optimization Pragmas
|
|
|
+@subsection Optimization Pragmas
|
|
|
+
|
|
|
+These pragmas enable a particular optimization for specific function
|
|
|
+definitions. The settings take effect at the end of a function
|
|
|
+definition, so the clean place to use these pragmas is between
|
|
|
+function definitions.
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item #pragma GCC optimize @var{optimization}
|
|
|
+@itemx _Pragma ("GCC optimize @var{optimization}")
|
|
|
+These pragmas enable the optimization @var{optimization} for the
|
|
|
+following functions. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma ("GCC optimize \"-fforward-propagate\"")
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+says to apply the @samp{forward-propagate} optimization to all
|
|
|
+following function definitions. Specifying optimizations for
|
|
|
+individual functions, rather than for the entire program, is rare but
|
|
|
+can be useful for getting around a bug in the compiler.
|
|
|
+
|
|
|
+If @var{optimization} does not correspond to a defined optimization
|
|
|
+option, the pragma is erroneous. To turn off an optimization, use the
|
|
|
+corresponding @samp{-fno-} option, such as
|
|
|
+@samp{-fno-forward-propagate}.
|
|
|
+
|
|
|
+@item #pragma GCC target @var{optimizations}
|
|
|
+@itemx _Pragma ("GCC target @var{optimizations}")
|
|
|
+The pragma @samp{GCC target} is similar to @samp{GCC optimize} but is
|
|
|
+used for platform-specific optimizations. Thus,
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma ("GCC target \"popcnt\"")
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+activates the optimization @samp{popcnt} for all
|
|
|
+following function definitions. This optimization is supported
|
|
|
+on a few common targets but not on others.
|
|
|
+
|
|
|
+@item #pragma GCC push_options
|
|
|
+@itemx _Pragma ("GCC push_options")
|
|
|
+The @samp{push_options} pragma saves on a stack the current settings
|
|
|
+specified with the @samp{target} and @samp{optimize} pragmas.
|
|
|
+
|
|
|
+@item #pragma GCC pop_options
|
|
|
+@itemx _Pragma ("GCC pop_options")
|
|
|
+The @samp{pop_options} pragma pops saved settings from that stack.
|
|
|
+
|
|
|
+Here's an example of using this stack.
|
|
|
+
|
|
|
+@example
|
|
|
+_Pragma ("GCC push_options")
|
|
|
+_Pragma ("GCC optimize \"forward-propagate\"")
|
|
|
+
|
|
|
+/* @r{Functions to compile}
|
|
|
+ @r{with the @code{forward-propagate} optimization.} */
|
|
|
+
|
|
|
+_Pragma ("GCC pop_options")
|
|
|
+/* @r{Ends enablement of @code{forward-propagate}.} */
|
|
|
+@end example
|
|
|
+
|
|
|
+@item #pragma GCC reset_options
|
|
|
+@itemx _Pragma ("GCC reset_options")
|
|
|
+Clears all pragma-defined @samp{target} and @samp{optimize}
|
|
|
+optimization settings.
|
|
|
+@end table
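+
+The same stack can wrap a single function written with the
+@code{#pragma} form.  This sketch assumes GCC, which expects the
+argument of @samp{GCC optimize} to be a string or a number; the
+function @code{sum_to} is hypothetical.  Other compilers may simply
+ignore these pragmas.

```c
#pragma GCC push_options
#pragma GCC optimize ("O0")

/* Compiled without optimization, whatever the command line says.  */
long
sum_to (long n)
{
  long total = 0;
  for (long i = 1; i <= n; i++)
    total += i;
  return total;
}

#pragma GCC pop_options
/* Functions defined from here on use the earlier settings again.  */
```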
|
|
|
+
|
|
|
+@node Static Assertions
|
|
|
+@section Static Assertions
|
|
|
+@cindex static assertions
|
|
|
+@findex _Static_assert
|
|
|
+
|
|
|
+You can add compile-time tests for necessary conditions into your
|
|
|
+code using @code{_Static_assert}. This can be useful, for example, to
|
|
|
+check that the compilation target platform supports the type sizes
|
|
|
+that the code expects. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+_Static_assert ((sizeof (long int) >= 8),
|
|
|
+ "long int needs to be at least 8 bytes");
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+reports a compile-time error if compiled on a system with long
|
|
|
+integers smaller than 8 bytes, with @samp{long int needs to be at
|
|
|
+least 8 bytes} as the error message.
|
|
|
+
|
|
|
+Since uses of @code{_Static_assert} are processed at compile time, the
|
|
|
+expression must be computable at compile time and the error message
|
|
|
+must be a literal string. The expression can refer to the sizes of
|
|
|
+variables, but can't refer to their values. For example, the
|
|
|
+following static assertion is invalid for two reasons:
|
|
|
+
|
|
|
+@example
|
|
|
+char *error_message
|
|
|
+ = "long int needs to be at least 8 bytes";
|
|
|
+int size_of_long_int = sizeof (long int);
|
|
|
+
|
|
|
+_Static_assert (size_of_long_int == 8, error_message);
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+The expression @code{size_of_long_int == 8} isn't computable at
|
|
|
+compile time, and the error message isn't a literal string.
|
|
|
+
|
|
|
+You can, though, use preprocessor definition values with
|
|
|
+@code{_Static_assert}:
|
|
|
+
|
|
|
+@example
|
|
|
+#define LONG_INT_ERROR_MESSAGE "long int needs to be \
|
|
|
+at least 8 bytes"
|
|
|
+
|
|
|
+_Static_assert ((sizeof (long int) >= 8),
|
|
|
+ LONG_INT_ERROR_MESSAGE);
|
|
|
+@end example
|
|
|
+
|
|
|
+Static assertions are permitted wherever a statement or declaration is
|
|
|
+permitted, including at top level in the file, and also inside the
|
|
|
+definition of a type.
|
|
|
+
|
|
|
+@example
|
|
|
+union y
|
|
|
+@{
|
|
|
+ int i;
|
|
|
+ int *ptr;
|
|
|
+ _Static_assert (sizeof (int *) == sizeof (int),
|
|
|
+ "Pointer and int not same size");
|
|
|
+@};
|
|
|
+@end example
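+
+The placements described above can be combined in one file.  The
+following is a minimal sketch; @code{BUFFER_BYTES}, @code{struct
+header} and @code{ints_per_buffer} are hypothetical names, and the
+conditions are assumptions that hold on common platforms.

```c
#include <limits.h>
#include <stddef.h>

#define BUFFER_BYTES 64   /* Hypothetical size, for illustration.  */

/* At top level in the file.  */
_Static_assert (CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert (BUFFER_BYTES % sizeof (int) == 0,
                "buffer must hold a whole number of ints");

struct header
{
  unsigned char tag;
  unsigned char length;
};

_Static_assert (offsetof (struct header, length) == 1,
                "length must directly follow tag");

size_t
ints_per_buffer (void)
{
  /* Also allowed where a declaration can appear in a block.  */
  _Static_assert (BUFFER_BYTES > 0, "buffer must not be empty");
  return BUFFER_BYTES / sizeof (int);
}
```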
|
|
|
+
|
|
|
+@node Type Alignment
|
|
|
+@appendix Type Alignment
|
|
|
+@cindex type alignment
|
|
|
+@cindex alignment of type
|
|
|
+@findex _Alignof
|
|
|
+@findex __alignof__
|
|
|
+
|
|
|
+Code for device drivers and other communication with low-level
|
|
|
+hardware sometimes needs to be concerned with the alignment of
|
|
|
+data objects in memory.
|
|
|
+
|
|
|
+Each data type has a required @dfn{alignment}, always a power of 2,
|
|
|
+that says at which memory addresses an object of that type can validly
|
|
|
+start. A valid address for the type must be a multiple of its
|
|
|
+alignment. If a type's alignment is 1, that means it can validly
|
|
|
+start at any address. If a type's alignment is 2, that means it can
|
|
|
+only start at an even address. If a type's alignment is 4, that means
|
|
|
+it can only start at an address that is a multiple of 4.
|
|
|
+
|
|
|
+The alignment of a type (except @code{char}) can vary depending on the
|
|
|
+kind of computer in use. To refer to the alignment of a type in a C
|
|
|
+program, use @code{_Alignof}, whose syntax parallels that of
|
|
|
+@code{sizeof}. Like @code{sizeof}, @code{_Alignof} is a compile-time
|
|
|
+operation, and it doesn't compute the value of the expression used
|
|
|
+as its argument.
|
|
|
+
|
|
|
+Nominally, each integer and floating-point type has an alignment equal to
|
|
|
+the largest power of 2 that divides its size. Thus, @code{int} with
|
|
|
+size 4 has a nominal alignment of 4, and @code{long long int} with
|
|
|
+size 8 has a nominal alignment of 8.
|
|
|
+
|
|
|
+However, each kind of computer generally has a maximum alignment, and
|
|
|
+no type needs more alignment than that. If the computer's maximum
|
|
|
+alignment is 4 (which is common), then no type's alignment is more
|
|
|
+than 4.
|
|
|
+
|
|
|
+The size of any type is always a multiple of its alignment; that way,
|
|
|
+in an array whose elements have that type, all the elements are
|
|
|
+properly aligned if the first one is.
|
|
|
+
|
|
|
+These rules apply to all real computers today, but some embedded
|
|
|
+controllers have odd exceptions. We don't have references to cite for
|
|
|
+them.
|
|
|
+@c We can't cite a nonfree manual as documentation.
|
|
|
+
|
|
|
+Ordinary C code guarantees that every object of a given type is in
|
|
|
+fact aligned as that type requires.
|
|
|
+
|
|
|
+If the operand of @code{_Alignof} is a structure field, the value
|
|
|
+is the alignment it requires. It may have a greater alignment by
|
|
|
+coincidence, due to the other fields, but @code{_Alignof} is not
|
|
|
+concerned about that. @xref{Structures}.
|
|
|
+
|
|
|
+Older versions of GNU C used the keyword @code{__alignof__} for this,
|
|
|
+but now that the feature has been standardized, it is better
|
|
|
+to use the standard keyword @code{_Alignof}.
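+
+As an illustration, this sketch wraps @code{_Alignof} in small
+functions so that the values can be inspected at run time.  All the
+names are hypothetical, and the actual results depend on the kind of
+computer in use.

```c
#include <stddef.h>

struct mixed
{
  char c;
  double d;
};

size_t align_of_char (void)   { return _Alignof (char); }
size_t align_of_double (void) { return _Alignof (double); }
size_t align_of_mixed (void)  { return _Alignof (struct mixed); }

/* Nonzero if N is a power of 2.  */
int
is_power_of_2 (size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}
```

+
+The alignment of @code{char} is always 1, and the size of
+@code{struct mixed} is a multiple of its alignment, as stated above.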
|
|
|
+
|
|
|
+@findex _Alignas
|
|
|
+@findex __aligned__
|
|
|
+You can explicitly specify an alignment requirement for a particular
|
|
|
+variable or structure field by adding @code{_Alignas
|
|
|
+(@var{alignment})} to the declaration, where @var{alignment} is a
|
|
|
+power of 2 or a type name. For instance:
|
|
|
+
|
|
|
+@example
|
|
|
+char _Alignas (8) x;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+or
|
|
|
+
|
|
|
+@example
|
|
|
+char _Alignas (double) x;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+specifies that @code{x} must start on an address that is a multiple of
|
|
|
+8. However, if @var{alignment} exceeds the maximum alignment for the
|
|
|
+machine, that maximum is how much alignment @code{x} will get.
|
|
|
+
|
|
|
+The older GNU C syntax for this feature was to add
|
|
|
+@code{__attribute__ ((__aligned__ (@var{alignment})))} to the
|
|
|
+declaration, after the variable name.  For instance:
|
|
|
+
|
|
|
+@example
|
|
|
+char x __attribute__ ((__aligned__ (8)));
|
|
|
+@end example
|
|
|
+
|
|
|
+@xref{Attributes}.
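+
+To see the effect of @code{_Alignas}, one can check the address of
+the declared object.  This is a sketch under two assumptions: that the
+target supports 16-byte alignment, and that converting a pointer to
+@code{uintptr_t} yields its numeric address.  The names are
+hypothetical.

```c
#include <stdint.h>

/* Request 16-byte alignment for a byte buffer.  */
static _Alignas (16) unsigned char buffer[64];

/* Nonzero if BUFFER starts at a multiple of 16.  */
int
buffer_is_aligned (void)
{
  return (uintptr_t) buffer % 16 == 0;
}
```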
|
|
|
+
|
|
|
+@node Aliasing
|
|
|
+@appendix Aliasing
|
|
|
+@cindex aliasing (of storage)
|
|
|
+@cindex pointer type conversion
|
|
|
+@cindex type conversion, pointer
|
|
|
+
|
|
|
+We have already presented examples of casting a @code{void *} pointer
|
|
|
+to another pointer type, and casting another pointer type to
|
|
|
+@code{void *}.
|
|
|
+
|
|
|
+One common kind of pointer cast is guaranteed safe: casting the value
|
|
|
+returned by @code{malloc} and related functions (@pxref{Dynamic Memory
|
|
|
+Allocation}). It is safe because these functions do not save the
|
|
|
+pointer anywhere else; the only way the program will access the newly
|
|
|
+allocated memory is via the pointer just returned.
|
|
|
+
|
|
|
+In fact, C allows casting any pointer type to any other pointer type.
|
|
|
+Using this to access the same place in memory using two
|
|
|
+different data types is called @dfn{aliasing}.
|
|
|
+
|
|
|
+Aliasing is necessary in some programs that do sophisticated memory
|
|
|
+management, such as GNU Emacs, but most C programs don't need to do
|
|
|
+aliasing. When it isn't needed, @strong{stay away from it!} To do
|
|
|
+aliasing correctly requires following the rules stated below.
|
|
|
+Otherwise, the aliasing may result in malfunctions when the program
|
|
|
+runs.
|
|
|
+
|
|
|
+The rest of this appendix explains the pitfalls and rules of aliasing.
|
|
|
+
|
|
|
+@menu
|
|
|
+* Aliasing Alignment:: Memory alignment considerations for
|
|
|
+ casting between pointer types.
|
|
|
+* Aliasing Length:: Type size considerations for
|
|
|
+ casting between pointer types.
|
|
|
+* Aliasing Type Rules:: Even when type alignment and size matches,
|
|
|
+ aliasing can still have surprising results.
|
|
|
+
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node Aliasing Alignment
|
|
|
+@appendixsection Aliasing and Alignment
|
|
|
+
|
|
|
+In order for a type-converted pointer to be valid, it must have the
|
|
|
+alignment that the new pointer type requires. For instance, on most
|
|
|
+computers, @code{int} has alignment 4; the address of an @code{int}
|
|
|
+must be a multiple of 4. However, @code{char} has alignment 1, so the
|
|
|
+address of a @code{char} is usually not a multiple of 4. Taking the
|
|
|
+address of such a @code{char} and casting it to @code{int *} probably
|
|
|
+results in an invalid pointer. Trying to dereference it may cause a
|
|
|
+@code{SIGBUS} signal, depending on the platform in use (@pxref{Signals}).
|
|
|
+
|
|
|
+@example
|
|
|
+int foo (void)
|
|
|
+@{
|
|
|
+ char i[4];
|
|
|
+ int *p = (int *) &i[1]; /* @r{Misaligned pointer!} */
|
|
|
+ return *p; /* @r{Crash!} */
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This requirement is never a problem when casting the return value
|
|
|
+of @code{malloc} because that function always returns a pointer
|
|
|
+with as much alignment as any type can require.
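+
+When the bytes of an @code{int} may sit at a misaligned address, one
+way to avoid the invalid pointer is to copy them into a properly
+aligned object.  This uses @code{memcpy}, a technique this appendix
+does not otherwise discuss, so treat it as a sketch.  The function
+name is hypothetical.

```c
#include <string.h>

/* Read an int stored at BYTES, which need not be aligned.  */
int
load_int_unaligned (const char *bytes)
{
  int value;
  memcpy (&value, bytes, sizeof value);
  return value;
}
```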
|
|
|
+
|
|
|
+@node Aliasing Length
|
|
|
+@appendixsection Aliasing and Length
|
|
|
+
|
|
|
+When converting a pointer to a different pointer type, make sure the
|
|
|
+object it really points to is at least as long as the target of the
|
|
|
+converted pointer. For instance, suppose @code{p} has type @code{int
|
|
|
+*} and it's cast as follows:
|
|
|
+
|
|
|
+@example
|
|
|
+int *p;
|
|
|
+
|
|
|
+struct foo
|
|
|
+  @{
|
|
|
+    double d, e, f;
|
|
|
+  @};
|
|
|
+
|
|
|
+struct foo *q = (struct foo *)p;
|
|
|
+
|
|
|
+q->f = 5.14159;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+the value @code{q->f} will run past the end of the @code{int} that
|
|
|
+@code{p} points to. If @code{p} was initialized to the start of an
|
|
|
+array of type @code{int[6]}, the object is long enough for three
|
|
|
+@code{double}s. But if @code{p} points to something shorter,
|
|
|
+@code{q->f} will run on beyond the end of that, overlaying some other
|
|
|
+data. Storing that will garble that other data. Or it could extend
|
|
|
+past the end of memory space and cause a @code{SIGSEGV} signal
|
|
|
+(@pxref{Signals}).
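+
+One defensive approach is to pass the real length of the object along
+with the pointer and check it before storing.  The following sketch
+assumes the caller knows that length; @code{store_f} is a hypothetical
+helper, and it copies the value with @code{memcpy} rather than
+dereferencing a converted pointer.

```c
#include <stddef.h>
#include <string.h>

struct foo
{
  double d, e, f;
};

/* Store VALUE where field f would be, but only if the object of
   SIZE bytes at BUFFER is long enough to hold a struct foo.
   Return 1 on success, 0 if the object is too short.  */
int
store_f (void *buffer, size_t size, double value)
{
  if (size < sizeof (struct foo))
    return 0;
  memcpy ((char *) buffer + offsetof (struct foo, f),
          &value, sizeof value);
  return 1;
}
```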
|
|
|
+
|
|
|
+@node Aliasing Type Rules
|
|
|
+@appendixsection Type Rules for Aliasing
|
|
|
+
|
|
|
+C code that converts a pointer to a different pointer type can use the
|
|
|
+pointers to access the same memory locations with two different data
|
|
|
+types. If the same address is accessed with different types in a
|
|
|
+single control thread, optimization can make the code do surprising
|
|
|
+things (in effect, make it malfunction).
|
|
|
+
|
|
|
+Here's a concrete example in which aliasing can change the code's
|
|
|
+behavior when it is optimized. We assume that @code{float} is 4 bytes
|
|
|
+long, like @code{int}, and so is every pointer. Thus, the structures
|
|
|
+@code{struct a} and @code{struct b} are both 8 bytes.
|
|
|
+
|
|
|
+@example
|
|
|
+#include <stdio.h>
|
|
|
+struct a @{ int size; char *data; @};
|
|
|
+struct b @{ float size; char *data; @};
|
|
|
+
|
|
|
+void sub (struct a *p, struct b *q)
|
|
|
+@{
|
|
|
+ int x;
|
|
|
+ p->size = 0;
|
|
|
+ q->size = 1;
|
|
|
+ x = p->size;
|
|
|
+ printf("x =%d\n", x);
|
|
|
+ printf("p->size =%d\n", (int)p->size);
|
|
|
+ printf("q->size =%d\n", (int)q->size);
|
|
|
+@}
|
|
|
+
|
|
|
+int main(void)
|
|
|
+@{
|
|
|
+ struct a foo;
|
|
|
+ struct a *p = &foo;
|
|
|
+ struct b *q = (struct b *) &foo;
|
|
|
+
|
|
|
+ sub (p, q);
|
|
|
+@}
|
|
|
+@end example
|
|
|
+
|
|
|
+This code works as intended when compiled without optimization. All
|
|
|
+the operations are carried out sequentially as written. The code
|
|
|
+sets @code{x} to @code{p->size}, but what it actually gets is the
|
|
|
+bits of the floating point number 1, as type @code{int}.
|
|
|
+
|
|
|
+However, when optimizing, the compiler is allowed to assume
|
|
|
+(mistakenly, here) that @code{q} does not point to the same storage as
|
|
|
+@code{p}, because their data types are not allowed to alias.
|
|
|
+
|
|
|
+From this assumption, the compiler can deduce (falsely, here) that the
|
|
|
+assignment into @code{q->size} has no effect on the value of
|
|
|
+@code{p->size}, which must therefore still be 0. Thus, @code{x} will
|
|
|
+be set to 0.
|
|
|
+
|
|
|
+GNU C, following the C standard, @emph{defines} this optimization as
|
|
|
+legitimate. Code that misbehaves when optimized following these rules
|
|
|
+is, by definition, incorrect C code.
|
|
|
+
|
|
|
+The rules for storage aliasing in C are based on the two data types:
|
|
|
+the type of the object, and the type it is accessed through. The
|
|
|
+rules permit accessing part of a storage object of type @var{t} using
|
|
|
+only these types:
|
|
|
+
|
|
|
+@itemize @bullet
|
|
|
+@item
|
|
|
+@var{t}.
|
|
|
+
|
|
|
+@item
|
|
|
+A type compatible with @var{t}. @xref{Compatible Types}.
|
|
|
+
|
|
|
+@item
|
|
|
+A signed or unsigned version of one of the above.
|
|
|
+
|
|
|
+@item
|
|
|
+A qualified version of one of the above.
|
|
|
+@xref{Type Qualifiers}.
|
|
|
+
|
|
|
+@item
|
|
|
+An array, structure (@pxref{Structures}), or union type
|
|
|
+(@pxref{Unions}) that contains one of the above, either directly as a
|
|
|
+field or through multiple levels of fields. If @var{t} is
|
|
|
+@code{double}, this would include @code{struct s @{ union @{ double
|
|
|
+d[2]; int i[4]; @} u; int i; @};} because there's a @code{double}
|
|
|
+inside it somewhere.
|
|
|
+
|
|
|
+@item
|
|
|
+A character type.
|
|
|
+@end itemize
|
|
|
+
|
|
|
+What do these rules say about the example in this subsection?
|
|
|
+
|
|
|
+For @code{foo.size} (equivalently, @code{p->size}), @var{t} is
|
|
|
+@code{int}.  The type @code{float} is not allowed as an aliasing type
|
|
|
+by those rules, so @code{q->size} is not supposed to alias with
|
|
|
+@code{p->size}.  Based on that assumption, GNU C makes a
|
|
|
+permitted optimization that was not, in this case, consistent with
|
|
|
+what the programmer intended the program to do.
|
|
|
+
|
|
|
+Whether GCC actually performs type-based aliasing analysis depends on
|
|
|
+the details of the code. GCC has other ways to determine (in some cases)
|
|
|
+whether objects alias, and if it gets a reliable answer that way, it won't
|
|
|
+fall back on type-based heuristics.
|
|
|
+
|
|
|
+@c @opindex -fno-strict-aliasing
|
|
|
+The importance of knowing the type-based aliasing rules is not so as
|
|
|
+to ensure that the optimization is done where it would be safe, but so
|
|
|
+as to ensure it is @emph{not} done in a way that would break the
|
|
|
+program. You can turn off type-based aliasing analysis by giving GCC
|
|
|
+the option @option{-fno-strict-aliasing}.
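+
+When a program really needs the bits of one type as another, two
+approaches stay within the rules listed above: copying the bytes, and
+accessing them through a character type.  This is a sketch that
+assumes a 4-byte @code{float} in the usual IEEE 754 format; the
+function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (float) == sizeof (uint32_t),
                "this sketch assumes a 4-byte float");

/* Return the bit pattern of F without accessing a float object
   through an integer lvalue.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Character types may access any object, so this is permitted.  */
unsigned char
first_byte (const int *p)
{
  return *(const unsigned char *) p;
}
```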
|
|
|
+
|
|
|
+@node Digraphs
|
|
|
+@appendix Digraphs
|
|
|
+@cindex digraphs
|
|
|
+
|
|
|
+C accepts two-character aliases, called @dfn{digraphs}, for certain
|
|
|
+characters.  Apparently in the 1990s some computer systems had
|
|
|
+trouble inputting or displaying those characters.  Digraphs almost
|
|
|
+never appear in C programs nowadays, but we mention them for completeness.
|
|
|
+
|
|
|
+@table @samp
|
|
|
+@item <:
|
|
|
+An alias for @samp{[}.
|
|
|
+@item :>
|
|
|
+An alias for @samp{]}.
|
|
|
+@item <%
|
|
|
+An alias for @samp{@{}.
|
|
|
+@item %>
|
|
|
+An alias for @samp{@}}.
|
|
|
+@item %:
|
|
|
+An alias for @samp{#},
|
|
|
+used for preprocessing directives (@pxref{Directives}) and
|
|
|
+macros (@pxref{Macros}).
|
|
|
+@end table
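+
+For illustration only, here is a small function written with the
+digraphs from the table above in place of the usual characters.  The
+function name is hypothetical, and the sketch assumes a compiler that
+accepts digraphs, as GCC does by default.

```c
%:include <stddef.h>

/* With the usual characters, the body would declare
   int values[3] and index it as values[i].  */
int
sum_three (void)
<%
  int values<:3:> = <% 1, 2, 3 %>;
  int total = 0;
  for (size_t i = 0; i < 3; i++)
    total += values<:i:>;
  return total;
%>
```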
|
|
|
+
|
|
|
+@node Attributes
|
|
|
+@appendix Attributes in Declarations
|
|
|
+@cindex attributes
|
|
|
+@findex __attribute__
|
|
|
+
|
|
|
+You can specify certain additional requirements in a declaration, to
|
|
|
+get fine-grained control over code generation, and helpful
|
|
|
+informational messages during compilation. We use a few attributes in
|
|
|
+code examples throughout this manual, including:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item aligned
|
|
|
+The @code{aligned} attribute specifies a minimum alignment for a
|
|
|
+variable or structure field, measured in bytes:
|
|
|
+
|
|
|
+@example
|
|
|
+int foo __attribute__ ((aligned (8))) = 0;
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+This directs GNU C to allocate @code{foo} at an address that is a
|
|
|
+multiple of 8 bytes. However, you can't force an alignment bigger
|
|
|
+than the computer's maximum meaningful alignment.
|
|
|
+
|
|
|
+@item packed
|
|
|
+The @code{packed} attribute specifies to compact the fields of a
|
|
|
+structure by not leaving gaps between fields. For example,
|
|
|
+
|
|
|
+@example
|
|
|
+struct __attribute__ ((packed)) bar
|
|
|
+@{
|
|
|
+ char a;
|
|
|
+ int b;
|
|
|
+@};
|
|
|
+@end example
|
|
|
+
|
|
|
+@noindent
|
|
|
+allocates the integer field @code{b} at byte 1 in the structure,
|
|
|
+immediately after the character field @code{a}. The packed structure
|
|
|
+is just 5 bytes long (assuming @code{int} is 4 bytes) and its
|
|
|
+alignment is 1, that of @code{char}.
|
|
|
+
|
|
|
+@item deprecated
|
|
|
+Applicable to both variables and functions, the @code{deprecated}
|
|
|
+attribute tells the compiler to issue a warning if the variable or
|
|
|
+function is ever used in the source file.
|
|
|
+
|
|
|
+@example
|
|
|
+int old_foo __attribute__ ((deprecated));
|
|
|
+
|
|
|
+int old_quux () __attribute__ ((deprecated));
|
|
|
+@end example
|
|
|
+
|
|
|
+@item __noinline__
|
|
|
+The @code{__noinline__} attribute, in a function's declaration or
|
|
|
+definition, specifies never to inline calls to that function. All
|
|
|
+calls to that function, in a compilation unit where it has this
|
|
|
+attribute, will be compiled to invoke the separately compiled
|
|
|
+function. @xref{Inline Function Definitions}.
|
|
|
+
|
|
|
+@item __noclone__
|
|
|
+The @code{__noclone__} attribute, in a function's declaration or
|
|
|
+definition, specifies never to clone that function. Thus, there will
|
|
|
+be only one compiled version of the function. @xref{Label Value
|
|
|
+Caveats}, for more information about cloning.
|
|
|
+
|
|
|
+@item always_inline
|
|
|
+The @code{always_inline} attribute, in a function's declaration or
|
|
|
+definition, specifies to inline all calls to that function (unless
|
|
|
+something about the function makes inlining impossible). This applies
|
|
|
+to all calls to that function in a compilation unit where it has this
|
|
|
+attribute. @xref{Inline Function Definitions}.
|
|
|
+
|
|
|
+@item gnu_inline
|
|
|
+The @code{gnu_inline} attribute, in a function's declaration or
|
|
|
+definition, specifies to handle the @code{inline} keyword the way GNU
|
|
|
+C originally implemented it, many years before ISO C said anything
|
|
|
+about inlining. @xref{Inline Function Definitions}.
|
|
|
+@end table
|
|
|
+
|
|
|
+For full documentation of attributes, see the GCC manual.
|
|
|
+@xref{Attribute Syntax, Attribute Syntax, Attribute Syntax, gcc, Using
|
|
|
+the GNU Compiler Collection}.
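+
+The effect of @code{packed} described above can be checked with
+@code{offsetof} and @code{sizeof}.  This is a sketch that assumes a
+compiler supporting the GNU @code{__attribute__} syntax; the accessor
+functions are hypothetical.

```c
#include <stddef.h>

struct __attribute__ ((packed)) bar
{
  char a;
  int b;
};

/* Offset of b: 1, since no gap is left after a.  */
size_t offset_of_b (void)  { return offsetof (struct bar, b); }

/* Total size: one char plus one int, with no padding.  */
size_t size_of_bar (void)  { return sizeof (struct bar); }

/* Alignment of the packed structure.  */
size_t align_of_bar (void) { return _Alignof (struct bar); }
```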
|
|
|
+
|
|
|
+@node Signals
|
|
|
+@appendix Signals
|
|
|
+@cindex signal
|
|
|
+@cindex handler (for signal)
|
|
|
+@cindex @code{SIGSEGV}
|
|
|
+@cindex @code{SIGFPE}
|
|
|
+@cindex @code{SIGBUS}
|
|
|
+
|
|
|
+Some program operations bring about an error condition called a
|
|
|
+@dfn{signal}. These signals terminate the program, by default.
|
|
|
+
|
|
|
+There are various different kinds of signals, each with a name. We
|
|
|
+have seen several such error conditions throughout this manual:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item SIGSEGV
|
|
|
+This signal is generated when a program tries to read or write outside
|
|
|
+the memory that is allocated for it, or to write memory that can only
|
|
|
+be read. The name is an abbreviation for ``segmentation violation''.
|
|
|
+
|
|
|
+@item SIGFPE
|
|
|
+This signal indicates a fatal arithmetic error. The name is an
|
|
|
+abbreviation for ``floating-point exception'', but covers all types of
|
|
|
+arithmetic errors, including division by zero and overflow.
|
|
|
+
|
|
|
+@item SIGBUS
|
|
|
+This signal is generated when an invalid pointer is dereferenced,
|
|
|
+typically the result of dereferencing an uninitialized pointer.  It is
|
|
|
+similar to @code{SIGSEGV}, except that @code{SIGSEGV} indicates
|
|
|
+invalid access to valid memory, while @code{SIGBUS} indicates an
|
|
|
+attempt to access an invalid address.
|
|
|
+@end table
|
|
|
+
|
|
|
+These kinds of signal allow the program to specify a function as a
|
|
|
+@dfn{signal handler}. When a signal has a handler, it doesn't
|
|
|
+terminate the program; instead it calls the handler.
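+
+As a minimal sketch of installing a handler, the following uses the
+standard library functions @code{signal} and @code{raise}.  It sends
+@code{SIGFPE} artificially instead of provoking a real arithmetic
+error, and assumes a POSIX-like system, where returning from the
+handler is well defined in that case.  The names @code{note_signal}
+and @code{raise_and_catch} are hypothetical.

```c
#include <signal.h>

/* Set by the handler; this type is safe to write from a handler.  */
static volatile sig_atomic_t got_signal = 0;

/* The signal handler: just record which signal arrived.  */
static void
note_signal (int signum)
{
  got_signal = signum;
}

/* Install the handler, send SIGFPE to this program, and return the
   signal number the handler recorded, or -1 on failure.  */
int
raise_and_catch (void)
{
  if (signal (SIGFPE, note_signal) == SIG_ERR)
    return -1;
  if (raise (SIGFPE) != 0)
    return -1;
  return got_signal;
}
```

+
+Because the signal has a handler, the program continues after
+@code{raise} instead of terminating.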
|
|
|
+
|
|
|
+There are many other kinds of signal; here we list only those that
|
|
|
+come from run-time errors in C operations. The rest have to do with
|
|
|
+the functioning of the operating system. The GNU C Library Reference
|
|
|
+Manual gives more explanation about signals (@pxref{Program Signal
|
|
|
+Handling, The GNU C Library, , libc, The GNU C Library Reference
|
|
|
+Manual}).
|
|
|
+
|
|
|
+@node GNU Free Documentation License
|
|
|
+@appendix GNU Free Documentation License
|
|
|
+
|
|
|
+@include fdl.texi
|
|
|
+
|
|
|
+@node Symbol Index
|
|
|
+@unnumbered Index of Symbols and Keywords
|
|
|
+
|
|
|
+@printindex fn
|
|
|
+
|
|
|
+@node Concept Index
|
|
|
+@unnumbered Concept Index
|
|
|
+
|
|
|
+@printindex cp
|
|
|
+
|
|
|
+@bye
|