7 Network Working Group D. Crocker, Ed.
|
yuuji@0
|
8 Request for Comments: 5234 Brandenburg InternetWorking
|
yuuji@0
|
9 STD: 68 P. Overell
|
yuuji@0
|
10 Obsoletes: 4234 THUS plc.
|
yuuji@0
|
11 Category: Standards Track January 2008
|
yuuji@0
|
12
|
yuuji@0
|
13
|
yuuji@0
|
14 Augmented BNF for Syntax Specifications: ABNF
|
yuuji@0
|
15
|
yuuji@0
|
16 Status of This Memo
|
yuuji@0
|
17
|
yuuji@0
|
18 This document specifies an Internet standards track protocol for the
|
yuuji@0
|
19 Internet community, and requests discussion and suggestions for
|
yuuji@0
|
20 improvements. Please refer to the current edition of the "Internet
|
yuuji@0
|
21 Official Protocol Standards" (STD 1) for the standardization state
|
yuuji@0
|
22 and status of this protocol. Distribution of this memo is unlimited.
|
yuuji@0
|
23
|
yuuji@0
|
24 Abstract
|
yuuji@0
|
25
|
yuuji@0
|
26 Internet technical specifications often need to define a formal
|
yuuji@0
|
27 syntax. Over the years, a modified version of Backus-Naur Form
|
yuuji@0
|
28 (BNF), called Augmented BNF (ABNF), has been popular among many
|
yuuji@0
|
29 Internet specifications. The current specification documents ABNF.
|
yuuji@0
|
30 It balances compactness and simplicity with reasonable
|
yuuji@0
|
31 representational power. The differences between standard BNF and
|
yuuji@0
|
32 ABNF involve naming rules, repetition, alternatives, order-
|
yuuji@0
|
33 independence, and value ranges. This specification also supplies
|
yuuji@0
|
34 additional rule definitions and encoding for a core lexical analyzer
|
yuuji@0
|
35 of the type common to several Internet specifications.
|
yuuji@0
|
36
|
yuuji@0
|
58 Crocker & Overell Standards Track [Page 1]
|
yuuji@0
|
59
|
yuuji@0
|
60 RFC 5234 ABNF January 2008
|
yuuji@0
|
61
|
yuuji@0
|
62
|
yuuji@0
|
63 Table of Contents
|
yuuji@0
|
64
|
yuuji@0
|
65 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
|
yuuji@0
|
66 2. Rule Definition . . . . . . . . . . . . . . . . . . . . . . . 3
|
yuuji@0
|
67 2.1. Rule Naming . . . . . . . . . . . . . . . . . . . . . . . 3
|
yuuji@0
|
68 2.2. Rule Form . . . . . . . . . . . . . . . . . . . . . . . . 4
|
yuuji@0
|
69 2.3. Terminal Values . . . . . . . . . . . . . . . . . . . . . 4
|
yuuji@0
|
70 2.4. External Encodings . . . . . . . . . . . . . . . . . . . . 6
|
yuuji@0
|
71 3. Operators . . . . . . . . . . . . . . . . . . . . . . . . . . 6
|
yuuji@0
|
72 3.1. Concatenation: Rule1 Rule2 . . . . . . . . . . . . . . . 6
|
yuuji@0
|
73 3.2. Alternatives: Rule1 / Rule2 . . . . . . . . . . . . . . . 7
|
yuuji@0
|
74 3.3. Incremental Alternatives: Rule1 =/ Rule2 . . . . . . . . . 7
|
yuuji@0
|
75 3.4. Value Range Alternatives: %c##-## . . . . . . . . . . . . 8
|
yuuji@0
|
76 3.5. Sequence Group: (Rule1 Rule2) . . . . . . . . . . . . . . 8
|
yuuji@0
|
77 3.6. Variable Repetition: *Rule . . . . . . . . . . . . . . . 9
|
yuuji@0
|
78 3.7. Specific Repetition: nRule . . . . . . . . . . . . . . . 9
|
yuuji@0
|
79 3.8. Optional Sequence: [RULE] . . . . . . . . . . . . . . . . 9
|
yuuji@0
|
80 3.9. Comment: ; Comment . . . . . . . . . . . . . . . . . . . 9
|
yuuji@0
|
81 3.10. Operator Precedence . . . . . . . . . . . . . . . . . . . 10
|
yuuji@0
|
82 4. ABNF Definition of ABNF . . . . . . . . . . . . . . . . . . . 10
|
yuuji@0
|
83 5. Security Considerations . . . . . . . . . . . . . . . . . . . 12
|
yuuji@0
|
84 6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
|
yuuji@0
|
85 6.1. Normative References . . . . . . . . . . . . . . . . . . . 12
|
yuuji@0
|
86 6.2. Informative References . . . . . . . . . . . . . . . . . . 12
|
yuuji@0
|
87 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 13
|
yuuji@0
|
88 Appendix B. Core ABNF of ABNF . . . . . . . . . . . . . . . . . . 13
|
yuuji@0
|
89 B.1. Core Rules . . . . . . . . . . . . . . . . . . . . . . . . 13
|
yuuji@0
|
90 B.2. Common Encoding . . . . . . . . . . . . . . . . . . . . . 15
|
yuuji@0
|
91
|
yuuji@0
|
119 1. Introduction
|
yuuji@0
|
120
|
yuuji@0
|
121 Internet technical specifications often need to define a formal
|
yuuji@0
|
122 syntax and are free to employ whatever notation their authors deem
|
yuuji@0
|
123 useful. Over the years, a modified version of Backus-Naur Form
|
yuuji@0
|
124 (BNF), called Augmented BNF (ABNF), has been popular among many
|
yuuji@0
|
125 Internet specifications. It balances compactness and simplicity with
|
yuuji@0
|
126 reasonable representational power. In the early days of the Arpanet,
|
yuuji@0
|
127 each specification contained its own definition of ABNF. This
|
yuuji@0
|
128 included the email specifications, [RFC733] and then [RFC822], which
|
yuuji@0
|
129 came to be the common citations for defining ABNF. The current
|
yuuji@0
|
130 document separates those definitions to permit selective reference.
|
yuuji@0
|
131 Predictably, it also provides some modifications and enhancements.
|
yuuji@0
|
132
|
yuuji@0
|
133 The differences between standard BNF and ABNF involve naming rules,
|
yuuji@0
|
134 repetition, alternatives, order-independence, and value ranges.
|
yuuji@0
|
135 Appendix B supplies rule definitions and encoding for a core lexical
|
yuuji@0
|
136 analyzer of the type common to several Internet specifications. It
|
yuuji@0
|
137 is provided as a convenience and is otherwise separate from the meta
|
yuuji@0
|
138 language defined in the body of this document, and separate from its
|
yuuji@0
|
139 formal status.
|
yuuji@0
|
140
|
yuuji@0
|
141 2. Rule Definition
|
yuuji@0
|
142
|
yuuji@0
|
143 2.1. Rule Naming
|
yuuji@0
|
144
|
yuuji@0
|
145 The name of a rule is simply the name itself, that is, a sequence of
|
yuuji@0
|
146 characters, beginning with an alphabetic character, and followed by a
|
yuuji@0
|
147 combination of alphabetics, digits, and hyphens (dashes).
|
yuuji@0
|
148
|
yuuji@0
|
149 NOTE:
|
yuuji@0
|
150
|
yuuji@0
|
151 Rule names are case insensitive.
|
yuuji@0
|
152
|
yuuji@0
|
153 The names <rulename>, <Rulename>, <RULENAME>, and <rUlENamE> all
|
yuuji@0
|
154 refer to the same rule.
|
yuuji@0
|
155
|
yuuji@0
|
156 Unlike original BNF, angle brackets ("<", ">") are not required.
|
yuuji@0
|
157 However, angle brackets may be used around a rule name whenever their
|
yuuji@0
|
158 presence helps in discerning the use of a rule name. This is
|
yuuji@0
|
159 typically restricted to rule name references in free-form prose, or
|
yuuji@0
|
160 to distinguish partial rules that combine into a string not separated
|
yuuji@0
|
161 by white space, such as shown in the discussion about repetition,
|
yuuji@0
|
162 below.
|
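The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper names `strip_angles`, `is_rulename`, and `same_rule` are assumptions made for this example and are not defined by ABNF.

```python
import re

# ALPHA followed by any mix of ALPHA, DIGIT, and "-" (US-ASCII only).
_RULENAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

def strip_angles(name: str) -> str:
    """Drop optional surrounding angle brackets, e.g. "<rulename>"."""
    if len(name) >= 2 and name[0] == "<" and name[-1] == ">":
        return name[1:-1]
    return name

def is_rulename(name: str) -> bool:
    """Return True if `name` is a syntactically valid rule name."""
    return _RULENAME.fullmatch(strip_angles(name)) is not None

def same_rule(a: str, b: str) -> bool:
    """Rule names are case insensitive, so compare after lowercasing."""
    return strip_angles(a).lower() == strip_angles(b).lower()
```

With these helpers, `same_rule("<rulename>", "rUlENamE")` is true, while `is_rulename("1abc")` is false because the name does not begin with an alphabetic character.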
yuuji@0
|
163
|
yuuji@0
|
175 2.2. Rule Form
|
yuuji@0
|
176
|
yuuji@0
|
177 A rule is defined by the following sequence:
|
yuuji@0
|
178
|
yuuji@0
|
179 name = elements crlf
|
yuuji@0
|
180
|
yuuji@0
|
181 where <name> is the name of the rule, <elements> is one or more rule
|
yuuji@0
|
182 names or terminal specifications, and <crlf> is the end-of-line
|
yuuji@0
|
183 indicator (carriage return followed by line feed). The equal sign
|
yuuji@0
|
184 separates the name from the definition of the rule. The elements
|
yuuji@0
|
185 form a sequence of one or more rule names and/or value definitions,
|
yuuji@0
|
186 combined according to the various operators defined in this document,
|
yuuji@0
|
187 such as alternative and repetition.
|
yuuji@0
|
188
|
yuuji@0
|
189 For visual ease, rule definitions are left aligned. When a rule
|
yuuji@0
|
190 requires multiple lines, the continuation lines are indented. The
|
yuuji@0
|
191 left alignment and indentation are relative to the first lines of the
|
yuuji@0
|
192 ABNF rules and need not match the left margin of the document.
|
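The layout convention above (left-aligned rules, indented continuation lines) can be sketched as follows. The function name `logical_rules` is hypothetical, and the sketch assumes the input contains no comments and that every rule starts at the same column.

```python
import textwrap

def logical_rules(text: str) -> list[str]:
    """Join indented continuation lines onto the rule they continue."""
    rules: list[str] = []
    # Alignment is relative to the first lines of the rules, so remove
    # whatever common left margin the surrounding document uses.
    for line in textwrap.dedent(text).splitlines():
        if not line.strip():
            continue  # skip blank lines between rules
        if line[0] in " \t" and rules:
            # Indented line: a continuation of the previous rule.
            rules[-1] += " " + line.strip()
        else:
            rules.append(line.strip())
    return rules
```

For example, a two-line rule indented inside a document is folded into a single logical rule, while the next left-aligned line starts a new one.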
yuuji@0
|
193
|
yuuji@0
|
194 2.3. Terminal Values
|
yuuji@0
|
195
|
yuuji@0
|
196 Rules resolve into a string of terminal values, sometimes called
|
yuuji@0
|
197 characters. In ABNF, a character is merely a non-negative integer.
|
yuuji@0
|
198 In certain contexts, a specific mapping (encoding) of values into a
|
yuuji@0
|
199 character set (such as ASCII) will be specified.
|
yuuji@0
|
200
|
yuuji@0
|
201 Terminals are specified by one or more numeric characters, with the
|
yuuji@0
|
202 base interpretation of those characters indicated explicitly. The
|
yuuji@0
|
203 following bases are currently defined:
|
yuuji@0
|
204
|
yuuji@0
|
205 b = binary
|
yuuji@0
|
206
|
yuuji@0
|
207 d = decimal
|
yuuji@0
|
208
|
yuuji@0
|
209 x = hexadecimal
|
yuuji@0
|
210
|
yuuji@0
|
211 Hence:
|
yuuji@0
|
212
|
yuuji@0
|
213 CR = %d13
|
yuuji@0
|
214
|
yuuji@0
|
215 CR = %x0D
|
yuuji@0
|
216
|
yuuji@0
|
217 respectively specify the decimal and hexadecimal representation of
|
yuuji@0
|
218 [US-ASCII] for carriage return.
|
yuuji@0
|
219
|
yuuji@0
|
231 A concatenated string of such values is specified compactly, using a
|
yuuji@0
|
232 period (".") to indicate a separation of characters within that
|
yuuji@0
|
233 value. Hence:
|
yuuji@0
|
234
|
yuuji@0
|
235 CRLF = %d13.10
|
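As a minimal sketch of how the numeric terminals above resolve to integers, the following Python converts a terminal such as `%d13.10` into its list of values. The name `parse_num_val` is an assumption for illustration; the sketch handles only single values and dotted concatenations, and it leans on Python's `int()`, which is more lenient than the ABNF grammar.

```python
# Base indicator -> numeric base, as listed above.
BASES = {"b": 2, "d": 10, "x": 16}

def parse_num_val(token: str) -> list[int]:
    """Turn a terminal such as "%d13.10" into its list of integer values."""
    if len(token) < 3 or token[0] != "%" or token[1].lower() not in BASES:
        raise ValueError(f"not a numeric terminal: {token!r}")
    base = BASES[token[1].lower()]
    # A period separates the characters of a concatenated string of values.
    return [int(part, base) for part in token[2:].split(".")]
```

Both `%d13` and `%x0D` yield the single value 13, and `%d13.10` yields the two values 13 and 10.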
yuuji@0
|
236
|
yuuji@0
|
237 ABNF permits the specification of literal text strings directly,
|
yuuji@0
|
238 enclosed in quotation marks. Hence:
|
yuuji@0
|
239
|
yuuji@0
|
240 command = "command string"
|
yuuji@0
|
241
|
yuuji@0
|
242 Literal text strings are interpreted as a concatenated set of
|
yuuji@0
|
243 printable characters.
|
yuuji@0
|
244
|
yuuji@0
|
245 NOTE:
|
yuuji@0
|
246
|
yuuji@0
|
247 ABNF strings are case insensitive and the character set for these
|
yuuji@0
|
248 strings is US-ASCII.
|
yuuji@0
|
249
|
yuuji@0
|
250 Hence:
|
yuuji@0
|
251
|
yuuji@0
|
252 rulename = "abc"
|
yuuji@0
|
253
|
yuuji@0
|
254 and:
|
yuuji@0
|
255
|
yuuji@0
|
256 rulename = "aBc"
|
yuuji@0
|
257
|
yuuji@0
|
258 will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", and
|
yuuji@0
|
259 "ABC".
|
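The case-insensitive matching of quoted strings described above can be illustrated with a short Python sketch. The helper names `case_variants` and `matches_literal` are assumptions for this example, and the sketch assumes US-ASCII literals only.

```python
from itertools import product

def case_variants(literal: str) -> set[str]:
    """Every string that a quoted ABNF literal matches."""
    # Each character contributes its lowercase and uppercase forms;
    # non-letters contribute a single choice.
    choices = [{ch.lower(), ch.upper()} for ch in literal]
    return {"".join(combo) for combo in product(*choices)}

def matches_literal(literal: str, candidate: str) -> bool:
    """Case-insensitive comparison, valid for US-ASCII literals."""
    return literal.lower() == candidate.lower()
```

Calling `case_variants("abc")` and `case_variants("aBc")` produces the same eight strings listed above.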
yuuji@0
|
260
|
yuuji@0
|
261 To specify a rule that is case sensitive, specify the characters
|
yuuji@0
|
262 individually.
|
yuuji@0
|
263
|
yuuji@0
|
264 For example:
|
yuuji@0
|
265
|
yuuji@0
|
266 rulename = %d97 %d98 %d99
|
yuuji@0
|
267
|
yuuji@0
|
268 or
|
yuuji@0
|
269
|
yuuji@0
|
270 rulename = %d97.98.99
|
yuuji@0
|
271
|
yuuji@0
|
272 will match only the string that comprises only the lowercase
|
yuuji@0
|
273 characters, abc.
|
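By contrast, a rule written as individual values compares exact integers, so case matters. A minimal Python sketch, with the hypothetical names `ABC_LOWER` and `matches_values` chosen for this example:

```python
ABC_LOWER = [97, 98, 99]  # rulename = %d97.98.99

def matches_values(values: list[int], candidate: str) -> bool:
    """Compare a candidate against exact terminal values (case sensitive)."""
    return [ord(ch) for ch in candidate] == values
```

Here `matches_values(ABC_LOWER, "abc")` is true, while `"Abc"` or `"ABC"` does not match.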
yuuji@0
|
274
|
yuuji@0
|
287 2.4. External Encodings
|
yuuji@0
|
288
|
yuuji@0
|
289 External representations of terminal value characters will vary
|
yuuji@0
|
290 according to constraints in the storage or transmission environment.
|
yuuji@0
|
291 Hence, the same ABNF-based grammar may have multiple external
|
yuuji@0
|
292 encodings, such as one for a 7-bit US-ASCII environment, another for
|
yuuji@0
|
293 a binary octet environment, and still a different one when 16-bit
|
yuuji@0
|
294 Unicode is used. Encoding details are beyond the scope of ABNF,
|
yuuji@0
|
295 although Appendix B provides definitions for a 7-bit US-ASCII
|
yuuji@0
|
296 environment as has been common to much of the Internet.
|
yuuji@0
|
297
|
yuuji@0
|
298 By separating external encoding from the syntax, it is intended that
|
yuuji@0
|
299 alternate encoding environments can be used for the same syntax.
|
yuuji@0
|
300
|
yuuji@0
|
301 3. Operators
|
yuuji@0
|
302
|
yuuji@0
|
303 3.1. Concatenation: Rule1 Rule2
|
yuuji@0
|
304
|
yuuji@0
|
305 A rule can define a simple, ordered string of values (i.e., a
|
yuuji@0
|
306 concatenation of contiguous characters) by listing a sequence of rule
|
yuuji@0
|
307 names. For example:
|
yuuji@0
|
308
|
yuuji@0
|
309 foo = %x61 ; a
|
yuuji@0
|
310
|
yuuji@0
|
311 bar = %x62 ; b
|
yuuji@0
|
312
|
yuuji@0
|
313 mumble = foo bar foo
|
yuuji@0
|
314
|
yuuji@0
|
315 Thus, the rule <mumble> matches the lowercase string "aba".
|
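One way to see the concatenation example in action is to transcribe it into a regular expression. This is a sketch under the assumption that Python's `re` module is an acceptable stand-in for an ABNF matcher; the names `FOO`, `BAR`, `MUMBLE`, and `is_mumble` are illustrative.

```python
import re

FOO = re.escape(chr(0x61))  # foo = %x61 ; a
BAR = re.escape(chr(0x62))  # bar = %x62 ; b

# mumble = foo bar foo: the pieces are contiguous, with nothing in between.
MUMBLE = re.compile(FOO + BAR + FOO)

def is_mumble(text: str) -> bool:
    """Return True if the whole of `text` matches the rule <mumble>."""
    return MUMBLE.fullmatch(text) is not None
```

Because the values are given numerically, only the lowercase string matches, and no white space is permitted between the parts.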
yuuji@0
|
316
|
yuuji@0
|
317 Linear white space: Concatenation is at the core of the ABNF parsing
|
yuuji@0
|
318 model. A string of contiguous characters (values) is parsed
|
yuuji@0
|
319 according to the rules defined in ABNF. For Internet specifications,
|
yuuji@0
|
320 there is some history of permitting linear white space (space and
|
yuuji@0
|
321 horizontal tab) to be freely and implicitly interspersed around major
|
yuuji@0
|
322 constructs, such as delimiting special characters or atomic strings.
|
yuuji@0
|
323
|
yuuji@0
|
324 NOTE:
|
yuuji@0
|
325
|
yuuji@0
|
326 This specification for ABNF does not provide for implicit
|
yuuji@0
|
327 specification of linear white space.
|
yuuji@0
|
328
|
yuuji@0
|
329 Any grammar that wishes to permit linear white space around
|
yuuji@0
|
330 delimiters or string segments must specify it explicitly. It is
|
yuuji@0
|
331 often useful to provide for such white space in "core" rules that are
|
yuuji@0
|
332 then used variously among higher-level rules. The "core" rules might
|
yuuji@0
|
333 be formed into a lexical analyzer or simply be part of the main
|
yuuji@0
|
334 ruleset.
|
yuuji@0
|
335
|
yuuji@0
|
343 3.2. Alternatives: Rule1 / Rule2
|
yuuji@0
|
344
|
yuuji@0
|
345 Elements separated by a forward slash ("/") are alternatives.
|
yuuji@0
|
346 Therefore,
|
yuuji@0
|
347
|
yuuji@0
|
348 foo / bar
|
yuuji@0
|
349
|
yuuji@0
|
350 will accept <foo> or <bar>.
|
yuuji@0
|
351
|
yuuji@0
|
352 NOTE:
|
yuuji@0
|
353
|
yuuji@0
|
354 A quoted string containing alphabetic characters is a special form
|
yuuji@0
|
355 for specifying alternative characters and is interpreted as a non-
|
yuuji@0
|
356 terminal representing the set of combinatorial strings with the
|
yuuji@0
|
357 contained characters, in the specified order but with any mixture
|
yuuji@0
|
358 of upper- and lowercase.
|
yuuji@0
|
359
|
yuuji@0
|
360 3.3. Incremental Alternatives: Rule1 =/ Rule2
|
yuuji@0
|
361
|
yuuji@0
|
362 It is sometimes convenient to specify a list of alternatives in
|
yuuji@0
|
363 fragments. That is, an initial rule may match one or more
|
yuuji@0
|
364 alternatives, with later rule definitions adding to the set of
|
yuuji@0
|
365 alternatives. This is particularly useful for otherwise independent
|
yuuji@0
|
366 specifications that derive from the same parent ruleset, such as
|
yuuji@0
|
367 often occurs with parameter lists. ABNF permits this incremental
|
yuuji@0
|
368 definition through the construct:
|
yuuji@0
|
369
|
yuuji@0
|
370 oldrule =/ additional-alternatives
|
yuuji@0
|
371
|
yuuji@0
|
372 So that the ruleset
|
yuuji@0
|
373
|
yuuji@0
|
374 ruleset = alt1 / alt2
|
yuuji@0
|
375
|
yuuji@0
|
376 ruleset =/ alt3
|
yuuji@0
|
377
|
yuuji@0
|
378 ruleset =/ alt4 / alt5
|
yuuji@0
|
379
|
yuuji@0
|
380 is the same as specifying
|
yuuji@0
|
381
|
yuuji@0
|
382 ruleset = alt1 / alt2 / alt3 / alt4 / alt5
|
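The effect of incremental alternatives can be sketched with a small Python function that accumulates alternatives per rule. The name `build_ruleset` is hypothetical, and the sketch assumes white space around the operator and rule bodies that contain only top-level alternatives (no groups, and no quoted strings containing "/").

```python
def build_ruleset(lines: list[str]) -> dict[str, list[str]]:
    """Collect alternatives per rule, honoring "=" and "=/" definitions."""
    rules: dict[str, list[str]] = {}
    for line in lines:
        name, op, body = line.split(None, 2)
        alternatives = [alt.strip() for alt in body.split("/")]
        key = name.lower()  # rule names are case insensitive
        if op == "=":
            rules[key] = alternatives
        elif op == "=/":
            # Incremental definition: the rule is expected to exist already,
            # so an unknown name raises KeyError in this sketch.
            rules[key].extend(alternatives)
        else:
            raise ValueError(f"unexpected operator: {op!r}")
    return rules
```

Feeding it the three-line ruleset above gives the same result as the single combined definition.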
yuuji@0
|
383
|
yuuji@0
|
399 3.4. Value Range Alternatives: %c##-##
|
yuuji@0
|
400
|
yuuji@0
|
401 A range of alternative numeric values can be specified compactly,
|
yuuji@0
|
402 using a dash ("-") to indicate the range of alternative values.
|
yuuji@0
|
403 Hence:
|
yuuji@0
|
404
|
yuuji@0
|
405 DIGIT = %x30-39
|
yuuji@0
|
406
|
yuuji@0
|
407 is equivalent to:
|
yuuji@0
|
408
|
yuuji@0
|
409 DIGIT = "0" / "1" / "2" / "3" / "4" / "5" / "6" /
|
yuuji@0
|
410
|
yuuji@0
|
411 "7" / "8" / "9"
|
yuuji@0
|
412
|
yuuji@0
|
413 Concatenated numeric values and numeric value ranges cannot be
|
yuuji@0
|
414 specified in the same string. A numeric value may use the dotted
|
yuuji@0
|
415 notation for concatenation or it may use the dash notation to specify
|
yuuji@0
|
416 one value range. Hence, to specify one printable character between
|
yuuji@0
|
417 end-of-line sequences, the specification could be:
|
yuuji@0
|
418
|
yuuji@0
|
419 char-line = %x0D.0A %x20-7E %x0D.0A
|
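The two examples above can be checked with a short Python sketch. It assumes that a regular expression is an adequate stand-in for the ABNF rules; the names `DIGIT`, `CHAR_LINE`, and `is_char_line` are chosen for illustration.

```python
import re

# DIGIT = %x30-39: every value from 0x30 through 0x39 inclusive.
DIGIT = {chr(value) for value in range(0x30, 0x39 + 1)}

# char-line = %x0D.0A %x20-7E %x0D.0A
# Dotted concatenation and a value range are separate elements.
CHAR_LINE = re.compile(r"\x0D\x0A[\x20-\x7E]\x0D\x0A")

def is_char_line(text: str) -> bool:
    """Return True for exactly one printable character between CRLFs."""
    return CHAR_LINE.fullmatch(text) is not None
```

The set `DIGIT` equals the ten characters "0" through "9", matching the expanded definition.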
yuuji@0
|
420
|
yuuji@0
|
421 3.5. Sequence Group: (Rule1 Rule2)
|
yuuji@0
|
422
|
yuuji@0
|
423 Elements enclosed in parentheses are treated as a single element,
|
yuuji@0
|
424 whose contents are strictly ordered. Thus,
|
yuuji@0
|
425
|
yuuji@0
|
426 elem (foo / bar) blat
|
yuuji@0
|
427
|
yuuji@0
|
428 matches (elem foo blat) or (elem bar blat), and
|
yuuji@0
|
429
|
yuuji@0
|
430 elem foo / bar blat
|
yuuji@0
|
431
|
yuuji@0
|
432 matches (elem foo) or (bar blat).
|
yuuji@0
|
433
|
yuuji@0
|
434 NOTE:
|
yuuji@0
|
435
|
yuuji@0
|
436 It is strongly advised that grouping notation be used, rather than
|
yuuji@0
|
437 relying on the proper reading of "bare" alternations, when
|
yuuji@0
|
438 alternatives consist of multiple rule names or literals.
|
yuuji@0
|
439
|
yuuji@0
|
440 Hence, it is recommended that the following form be used:
|
yuuji@0
|
441
|
yuuji@0
|
442 (elem foo) / (bar blat)
|
yuuji@0
|
443
|
yuuji@0
|
444 It will avoid misinterpretation by casual readers.
|
yuuji@0
|
445
|
yuuji@0
|
455 The sequence group notation is also used within free text to set off
|
yuuji@0
|
456 an element sequence from the prose.
|
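The difference between the grouped and the bare form can be made concrete with regular expressions. In this sketch the four rules are replaced by single-letter stand-in literals, which is purely an assumption for illustration; the names `GROUPED`, `BARE`, and `accepts` are likewise hypothetical.

```python
import re

# Stand-in literals for the rules elem, foo, bar, and blat.
ELEM, FOO, BAR, BLAT = "e", "f", "b", "t"

# elem (foo / bar) blat
GROUPED = re.compile(f"{ELEM}(?:{FOO}|{BAR}){BLAT}")

# elem foo / bar blat, which reads as (elem foo) / (bar blat)
BARE = re.compile(f"(?:{ELEM}{FOO})|(?:{BAR}{BLAT})")

def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if `pattern` matches the whole of `text`."""
    return pattern.fullmatch(text) is not None
```

The grouped form accepts "eft" and "ebt", whereas the bare form accepts only "ef" and "bt", which is why explicit grouping is the safer way to write such rules.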
yuuji@0
|
457
|
yuuji@0
|
458 3.6. Variable Repetition: *Rule
|
yuuji@0
|
459
|
yuuji@0
|
460 The operator "*" preceding an element indicates repetition. The full
|
yuuji@0
|
461 form is:
|
yuuji@0
|
462
|
yuuji@0
|
463 <a>*<b>element
|
yuuji@0
|
464
|
yuuji@0
|
465 where <a> and <b> are optional decimal values, indicating at least
|
yuuji@0
|
466 <a> and at most <b> occurrences of the element.
|
yuuji@0
|
467
|
yuuji@0
|
468 Default values are 0 and infinity so that *<element> allows any
|
yuuji@0
|
469 number, including zero; 1*<element> requires at least one;
|
yuuji@0
|
470 3*3<element> allows exactly 3; and 1*2<element> allows one or two.
|
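The defaults for the repetition bounds can be sketched as a small parser for the `<a>*<b>` prefix. The function name `parse_repeat` is an assumption for this example, and infinity is represented by `math.inf`.

```python
import math

def parse_repeat(spec: str) -> tuple[int, float]:
    """Return (minimum, maximum) for a prefix such as "1*2", "3*", or "*"."""
    low, star, high = spec.partition("*")
    if not star:
        raise ValueError(f"expected a '*' in repeat prefix: {spec!r}")
    minimum = int(low) if low else 0          # default lower bound is 0
    maximum = int(high) if high else math.inf  # default upper bound is infinity
    return minimum, maximum
```

So `"*"` gives `(0, inf)`, `"1*"` gives `(1, inf)`, `"3*3"` gives `(3, 3)`, and `"1*2"` gives `(1, 2)`.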
yuuji@0
|
471
|
yuuji@0
|
472 3.7. Specific Repetition: nRule
|
yuuji@0
|
473
|
yuuji@0
|
474 A rule of the form:
|
yuuji@0
|
475
|
yuuji@0
|
476 <n>element
|
yuuji@0
|
477
|
yuuji@0
|
478 is equivalent to
|
yuuji@0
|
479
|
yuuji@0
|
480 <n>*<n>element
|
yuuji@0
|
481
|
yuuji@0
|
482 That is, exactly <n> occurrences of <element>. Thus, 2DIGIT is a
|
yuuji@0
|
483 2-digit number, and 3ALPHA is a string of three alphabetic
|
yuuji@0
|
484 characters.
|
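As an illustration of the two examples above, specific repetition maps directly onto a fixed-count quantifier in a regular expression. This is a sketch that assumes US-ASCII input; the names `TWO_DIGIT` and `THREE_ALPHA` are hypothetical.

```python
import re

TWO_DIGIT = re.compile(r"[0-9]{2}")       # 2DIGIT, same as 2*2DIGIT
THREE_ALPHA = re.compile(r"[A-Za-z]{3}")  # 3ALPHA, same as 3*3ALPHA

def is_exact(pattern: re.Pattern, text: str) -> bool:
    """Return True if `pattern` matches the whole of `text`."""
    return pattern.fullmatch(text) is not None
```

A string with too few or too many occurrences, such as "7" or "123" for `2DIGIT`, is rejected.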
yuuji@0
|
485
|
yuuji@0
|
486 3.8. Optional Sequence: [RULE]
|
yuuji@0
|
487
|
yuuji@0
|
488 Square brackets enclose an optional element sequence:
|
yuuji@0
|
489
|
yuuji@0
|
490 [foo bar]
|
yuuji@0
|
491
|
yuuji@0
|
492 is equivalent to
|
yuuji@0
|
493
|
yuuji@0
|
494 *1(foo bar).
|
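The equivalence above can be checked with two regular expressions, one using the optional operator and one using an explicit zero-to-one bound. The literals "foo" and "bar" stand in for the rules of the same name, which is an assumption made only for this sketch.

```python
import re

OPTIONAL = re.compile(r"(?:foobar)?")         # [foo bar]
ZERO_OR_ONE = re.compile(r"(?:foobar){0,1}")  # *1(foo bar)

def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if `pattern` matches the whole of `text`."""
    return pattern.fullmatch(text) is not None
```

Both patterns accept the empty string and a single "foobar", and both reject a partial sequence such as "foo" or a doubled "foobarfoobar".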
yuuji@0
|
495
|
yuuji@0
|
496 3.9. Comment: ; Comment
|
yuuji@0
|
497
|
yuuji@0
|
498 A semicolon starts a comment that continues to the end of line. This
|
yuuji@0
|
499 is a simple way of including useful notes in parallel with the
|
yuuji@0
|
500 specifications.
|
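A tool that reads ABNF needs to drop comments before interpreting a rule. The following Python is a minimal sketch of that step; the name `strip_comment` is hypothetical, and the sketch assumes that a semicolon inside a quoted string or inside angle brackets is not a comment start.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing ";" comment, ignoring ";" inside quotes or <...>."""
    in_quotes = False
    in_angles = False
    for index, ch in enumerate(line):
        if in_quotes:
            in_quotes = ch != '"'   # stay quoted until the closing quote
        elif in_angles:
            in_angles = ch != ">"   # stay bracketed until the closing ">"
        elif ch == '"':
            in_quotes = True
        elif ch == "<":
            in_angles = True
        elif ch == ";":
            return line[:index].rstrip()
    return line.rstrip()
```

For instance, `foo = %x61 ; a` reduces to `foo = %x61`, while a rule whose value is the literal `";"` keeps that literal intact.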
yuuji@0
|
501
|
yuuji@0
|
511 3.10. Operator Precedence
|
yuuji@0
|
512
|
yuuji@0
|
513 The various mechanisms described above have the following precedence,
|
yuuji@0
|
514 from highest (binding tightest) at the top, to lowest (loosest) at
|
yuuji@0
|
515 the bottom:
|
yuuji@0
|
516
|
yuuji@0
|
517 Rule name, prose-val, Terminal value
|
yuuji@0
|
518
|
yuuji@0
|
519 Comment
|
yuuji@0
|
520
|
yuuji@0
|
521 Value range
|
yuuji@0
|
522
|
yuuji@0
|
523 Repetition
|
yuuji@0
|
524
|
yuuji@0
|
525 Grouping, Optional
|
yuuji@0
|
526
|
yuuji@0
|
527 Concatenation
|
yuuji@0
|
528
|
yuuji@0
|
529 Alternative
|
yuuji@0
|
530
|
yuuji@0
|
531 Use of the alternative operator, freely mixed with concatenations,
|
yuuji@0
|
532 can be confusing.
|
yuuji@0
|
533
|
yuuji@0
|
534 Again, it is recommended that the grouping operator be used to
|
yuuji@0
|
535 make explicit concatenation groups.
|
yuuji@0
|
536
|
yuuji@0
|
537 4. ABNF Definition of ABNF
|
yuuji@0
|
538
|
yuuji@0
|
539 NOTES:
|
yuuji@0
|
540
|
yuuji@0
|
541 1. This syntax requires a formatting of rules that is relatively
|
yuuji@0
|
542 strict. Hence, the version of a ruleset included in a
|
yuuji@0
|
543 specification might need preprocessing to ensure that it can
|
yuuji@0
|
544 be interpreted by an ABNF parser.
|
yuuji@0
|
545
|
yuuji@0
|
546 2. This syntax uses the rules provided in Appendix B.
|
yuuji@0
|
547
|
yuuji@0
|
548
|
yuuji@0
|
549 rulelist = 1*( rule / (*c-wsp c-nl) )
|
yuuji@0
|
550
|
yuuji@0
|
551 rule = rulename defined-as elements c-nl
|
yuuji@0
|
552 ; continues if next line starts
|
yuuji@0
|
553 ; with white space
|
yuuji@0
|
554
|
yuuji@0
|
555 rulename = ALPHA *(ALPHA / DIGIT / "-")
|
yuuji@0
|
556
|
yuuji@0
|
567 defined-as = *c-wsp ("=" / "=/") *c-wsp
|
yuuji@0
|
568 ; basic rules definition and
|
yuuji@0
|
569 ; incremental alternatives
|
yuuji@0
|
570
|
yuuji@0
|
571 elements = alternation *c-wsp
|
yuuji@0
|
572
|
yuuji@0
|
573 c-wsp = WSP / (c-nl WSP)
|
yuuji@0
|
574
|
yuuji@0
|
575 c-nl = comment / CRLF
|
yuuji@0
|
576 ; comment or newline
|
yuuji@0
|
577
|
yuuji@0
|
578 comment = ";" *(WSP / VCHAR) CRLF
|
yuuji@0
|
579
|
yuuji@0
|
580 alternation = concatenation
|
yuuji@0
|
581 *(*c-wsp "/" *c-wsp concatenation)
|
yuuji@0
|
582
|
yuuji@0
|
583 concatenation = repetition *(1*c-wsp repetition)
|
yuuji@0
|
584
|
yuuji@0
|
585 repetition = [repeat] element
|
yuuji@0
|
586
|
yuuji@0
|
587 repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
|
yuuji@0
|
588
|
yuuji@0
|
589 element = rulename / group / option /
|
yuuji@0
|
590 char-val / num-val / prose-val
|
yuuji@0
|
591
|
yuuji@0
|
592 group = "(" *c-wsp alternation *c-wsp ")"
|
yuuji@0
|
593
|
yuuji@0
|
594 option = "[" *c-wsp alternation *c-wsp "]"
|
yuuji@0
|
595
|
yuuji@0
|
596 char-val = DQUOTE *(%x20-21 / %x23-7E) DQUOTE
|
yuuji@0
|
597 ; quoted string of SP and VCHAR
|
yuuji@0
|
598 ; without DQUOTE
|
yuuji@0
|
599
|
yuuji@0
|
600 num-val = "%" (bin-val / dec-val / hex-val)
|
yuuji@0
|
601
|
yuuji@0
|
602 bin-val = "b" 1*BIT
|
yuuji@0
|
603 [ 1*("." 1*BIT) / ("-" 1*BIT) ]
|
yuuji@0
|
604 ; series of concatenated bit values
|
yuuji@0
|
605 ; or single ONEOF range
|
yuuji@0
|
606
|
yuuji@0
|
607 dec-val = "d" 1*DIGIT
|
yuuji@0
|
608 [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
|
yuuji@0
|
609
|
yuuji@0
|
610 hex-val = "x" 1*HEXDIG
|
yuuji@0
|
611 [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
|
yuuji@0
|
612
|
yuuji@0
|
623 prose-val = "<" *(%x20-3D / %x3F-7E) ">"
|
yuuji@0
|
624 ; bracketed string of SP and VCHAR
|
yuuji@0
|
625 ; without angles
|
yuuji@0
|
626 ; prose description, to be used as
|
yuuji@0
|
627 ; last resort
|
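Several of the leaf rules in the grammar above translate directly into regular expressions. The following Python is a sketch of that transcription for single tokens only; it does not implement groups, options, repetition, or whole rules, and the names `ELEMENT_PATTERNS` and `classify` are assumptions for illustration.

```python
import re
from typing import Optional

# Each pattern transcribes one leaf rule from the grammar above.
# Quoted letters such as "b", "d", and "x" are case insensitive in ABNF.
ELEMENT_PATTERNS = {
    "rulename": r"[A-Za-z][A-Za-z0-9-]*",
    "char-val": r'"[\x20-\x21\x23-\x7E]*"',
    "bin-val": r"%[bB][01]+(?:(?:\.[01]+)+|-[01]+)?",
    "dec-val": r"%[dD][0-9]+(?:(?:\.[0-9]+)+|-[0-9]+)?",
    "hex-val": r"%[xX][0-9A-Fa-f]+(?:(?:\.[0-9A-Fa-f]+)+|-[0-9A-Fa-f]+)?",
    "prose-val": r"<[\x20-\x3D\x3F-\x7E]*>",
}

def classify(token: str) -> Optional[str]:
    """Return the name of the leaf rule that `token` matches, if any."""
    for name, pattern in ELEMENT_PATTERNS.items():
        if re.fullmatch(pattern, token):
            return name
    return None
```

Note how the numeric patterns allow either a dotted series or a single range but not both, so a token such as `%d13.10-20` is classified as nothing.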
yuuji@0
|
628
|
yuuji@0
|
629 5. Security Considerations
|
yuuji@0
|
630
|
yuuji@0
|
631 Security is truly believed to be irrelevant to this document.
|
yuuji@0
|
632
|
yuuji@0
|
633 6. References
|
yuuji@0
|
634
|
yuuji@0
|
635 6.1. Normative References
|
yuuji@0
|
636
|
yuuji@0
|
637 [US-ASCII] American National Standards Institute, "Coded Character
|
yuuji@0
|
638 Set -- 7-bit American Standard Code for Information
|
yuuji@0
|
639 Interchange", ANSI X3.4, 1986.
|
yuuji@0
|
640
|
yuuji@0
|
641 6.2. Informative References
|
yuuji@0
|
642
|
yuuji@0
|
643 [RFC733] Crocker, D., Vittal, J., Pogran, K., and D. Henderson,
|
yuuji@0
|
644 "Standard for the format of ARPA network text messages",
|
yuuji@0
|
645 RFC 733, November 1977.
|
yuuji@0
|
646
|
yuuji@0
|
647 [RFC822] Crocker, D., "Standard for the format of ARPA Internet
|
yuuji@0
|
648 text messages", STD 11, RFC 822, August 1982.
|
yuuji@0
|
649
|
yuuji@0
|
679 Appendix A. Acknowledgements
|
yuuji@0
|
680
|
yuuji@0
|
681 The syntax for ABNF was originally specified in RFC 733. Ken L.
|
yuuji@0
|
682 Harrenstien, of SRI International, was responsible for re-coding the
|
yuuji@0
|
683 BNF into an Augmented BNF that makes the representation smaller and
|
yuuji@0
|
684 easier to understand.
|
yuuji@0
|
685
|
yuuji@0
|
686 This recent project began as a simple effort to cull out the portion
|
yuuji@0
|
687 of RFC 822 that has been repeatedly cited by non-email specification
|
yuuji@0
|
688 writers, namely the description of Augmented BNF. Rather than simply
|
yuuji@0
|
689 and blindly converting the existing text into a separate document,
|
yuuji@0
|
690 the working group chose to give careful consideration to the
|
yuuji@0
|
691 deficiencies, as well as benefits, of the existing specification and
|
yuuji@0
|
692 related specifications made available over the last 15 years, and
|
yuuji@0
|
693 therefore to pursue enhancement. This turned the project into
|
yuuji@0
|
694 something rather more ambitious than was first intended.
|
yuuji@0
|
695 Interestingly, the result is not massively different from that
|
yuuji@0
|
696 original, although decisions, such as removing the list notation,
|
yuuji@0
|
697 came as a surprise.
|
yuuji@0
|
698
|
yuuji@0
|
699 This "separated" version of the specification was part of the DRUMS
|
yuuji@0
|
700 working group, with significant contributions from Jerome Abela,
|
yuuji@0
|
701 Harald Alvestrand, Robert Elz, Roger Fajman, Aviva Garrett, Tom
|
yuuji@0
|
702 Harsch, Dan Kohn, Bill McQuillan, Keith Moore, Chris Newman, Pete
|
yuuji@0
|
703 Resnick, and Henning Schulzrinne.
|
yuuji@0
|
704
|
yuuji@0
|
705 Julian Reschke warrants special thanks for converting the Draft
|
yuuji@0
|
706 Standard version to XML source form.
|
yuuji@0
|
707
|
yuuji@0
|
708 Appendix B. Core ABNF of ABNF
|
yuuji@0
|
709
|
yuuji@0
|
710 This appendix contains some basic rules that are in common use.
|
yuuji@0
|
711 Basic rules are in uppercase. Note that these rules are only valid
|
yuuji@0
|
712 for ABNF encoded in 7-bit ASCII or in character sets that are a
|
yuuji@0
|
713 superset of 7-bit ASCII.
|
yuuji@0
|
714
|
yuuji@0
|
715 B.1. Core Rules
|
yuuji@0
|
716
|
yuuji@0
|
717 Certain basic rules are in uppercase, such as SP, HTAB, CRLF, DIGIT,
|
yuuji@0
|
718 ALPHA, etc.
|
yuuji@0
|
719
|
yuuji@0
|
720 ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
|
yuuji@0
|
721
|
yuuji@0
|
722 BIT = "0" / "1"
|
yuuji@0
|
723
|
yuuji@0
|
724 CHAR = %x01-7F
|
yuuji@0
|
725 ; any 7-bit US-ASCII character,
|
yuuji@0
|
726 ; excluding NUL
|
yuuji@0
|
727
|
yuuji@0
|
735 CR = %x0D
|
yuuji@0
|
736 ; carriage return
|
yuuji@0
|
737
|
yuuji@0
|
738 CRLF = CR LF
|
yuuji@0
|
739 ; Internet standard newline
|
yuuji@0
|
740
|
yuuji@0
|
741 CTL = %x00-1F / %x7F
|
yuuji@0
|
742 ; controls
|
yuuji@0
|
743
|
yuuji@0
|
744 DIGIT = %x30-39
|
yuuji@0
|
745 ; 0-9
|
yuuji@0
|
746
|
yuuji@0
|
747 DQUOTE = %x22
|
yuuji@0
|
748 ; " (Double Quote)
|
yuuji@0
|
749
|
yuuji@0
|
750 HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
|
yuuji@0
|
751
|
yuuji@0
|
752 HTAB = %x09
|
yuuji@0
|
753 ; horizontal tab
|
yuuji@0
|
754
|
yuuji@0
|
755 LF = %x0A
|
yuuji@0
|
756 ; linefeed
|
yuuji@0
|
757
|
yuuji@0
|
758 LWSP = *(WSP / CRLF WSP)
|
yuuji@0
|
759 ; Use of this linear-white-space rule
|
yuuji@0
|
760 ; permits lines containing only white
|
yuuji@0
|
761 ; space that are no longer legal in
|
yuuji@0
|
762 ; mail headers and have caused
|
yuuji@0
|
763 ; interoperability problems in other
|
yuuji@0
|
764 ; contexts.
|
yuuji@0
|
765 ; Do not use when defining mail
|
yuuji@0
|
766 ; headers and use with caution in
|
yuuji@0
|
767 ; other contexts.
|
yuuji@0
|
768
|
yuuji@0
|
769 OCTET = %x00-FF
|
yuuji@0
|
770 ; 8 bits of data
|
yuuji@0
|
771
|
yuuji@0
|
772 SP = %x20
|
yuuji@0
|
773
|
yuuji@0
|
774 VCHAR = %x21-7E
|
yuuji@0
|
775 ; visible (printing) characters
|
yuuji@0
|
776
|
yuuji@0
|
777 WSP = SP / HTAB
|
yuuji@0
|
778 ; white space
|
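A few of the core rules above can be expressed as single-character predicates. This Python sketch assumes each argument is exactly one character; the function names are chosen for illustration. Because quoted strings are case insensitive in ABNF, the `HEXDIG` predicate accepts lowercase "a" through "f" as well.

```python
def is_alpha(ch: str) -> bool:
    """ALPHA = %x41-5A / %x61-7A"""
    return 0x41 <= ord(ch) <= 0x5A or 0x61 <= ord(ch) <= 0x7A

def is_digit(ch: str) -> bool:
    """DIGIT = %x30-39"""
    return 0x30 <= ord(ch) <= 0x39

def is_hexdig(ch: str) -> bool:
    """HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F" (any case)"""
    return is_digit(ch) or ch in set("ABCDEFabcdef")

def is_vchar(ch: str) -> bool:
    """VCHAR = %x21-7E"""
    return 0x21 <= ord(ch) <= 0x7E

def is_wsp(ch: str) -> bool:
    """WSP = SP / HTAB"""
    return ch in (" ", "\t")

def is_ctl(ch: str) -> bool:
    """CTL = %x00-1F / %x7F"""
    return ord(ch) <= 0x1F or ord(ch) == 0x7F
```

Note that SP is white space but not a visible character, so `is_vchar(" ")` is false while `is_wsp(" ")` is true.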
yuuji@0
|
779
|
yuuji@0
|
791 B.2. Common Encoding
|
yuuji@0
|
792
|
yuuji@0
|
793 Externally, data are represented as "network virtual ASCII" (namely,
|
yuuji@0
|
794 7-bit US-ASCII in an 8-bit field), with the high (8th) bit set to
|
yuuji@0
|
795 zero. A string of values is in "network byte order", in which the
|
yuuji@0
|
796 higher-valued bytes are represented on the left-hand side and are
|
yuuji@0
|
797 sent over the network first.
|
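The encoding described above can be sketched in Python with the built-in "ascii" codec, which produces one octet per character with the high bit clear. The function name `to_network_ascii` is an assumption for this example.

```python
def to_network_ascii(text: str) -> bytes:
    """Encode text as 7-bit US-ASCII, one character per 8-bit field."""
    # Raises UnicodeEncodeError for any character outside US-ASCII.
    data = text.encode("ascii")
    # The high (8th) bit of every octet is zero by construction.
    assert all(octet & 0x80 == 0 for octet in data)
    return data
```

The octets come out in the same left-to-right order as the characters of the string, which is the order in which they would be sent.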
yuuji@0
|
798
|
yuuji@0
|
799 Authors' Addresses
|
yuuji@0
|
800
|
yuuji@0
|
801 Dave Crocker (editor)
|
yuuji@0
|
802 Brandenburg InternetWorking
|
yuuji@0
|
803 675 Spruce Dr.
|
yuuji@0
|
804 Sunnyvale, CA 94086
|
yuuji@0
|
805 US
|
yuuji@0
|
806
|
yuuji@0
|
807 Phone: +1.408.246.8253
|
yuuji@0
|
808 EMail: dcrocker@bbiw.net
|
yuuji@0
|
809
|
yuuji@0
|
810
|
yuuji@0
|
811 Paul Overell
|
yuuji@0
|
812 THUS plc.
|
yuuji@0
|
813 1/2 Berkeley Square,
|
yuuji@0
|
814 99 Berkeley Street
|
yuuji@0
|
815 Glasgow G3 7HR
|
yuuji@0
|
816 UK
|
yuuji@0
|
817
|
yuuji@0
|
818 EMail: paul.overell@thus.net
|
yuuji@0
|
819
|
yuuji@0
|
847 Full Copyright Statement
|
yuuji@0
|
848
|
yuuji@0
|
849 Copyright (C) The IETF Trust (2008).
|
yuuji@0
|
850
|
yuuji@0
|
851 This document is subject to the rights, licenses and restrictions
|
yuuji@0
|
852 contained in BCP 78, and except as set forth therein, the authors
|
yuuji@0
|
853 retain all their rights.
|
yuuji@0
|
854
|
yuuji@0
|
855 This document and the information contained herein are provided on an
|
yuuji@0
|
856 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
|
yuuji@0
|
857 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
|
yuuji@0
|
858 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
|
yuuji@0
|
859 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
|
yuuji@0
|
860 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
|
yuuji@0
|
861 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
|
yuuji@0
|
862
|
yuuji@0
|
863 Intellectual Property
|
yuuji@0
|
864
|
yuuji@0
|
865 The IETF takes no position regarding the validity or scope of any
|
yuuji@0
|
866 Intellectual Property Rights or other rights that might be claimed to
|
yuuji@0
|
867 pertain to the implementation or use of the technology described in
|
yuuji@0
|
868 this document or the extent to which any license under such rights
|
yuuji@0
|
869 might or might not be available; nor does it represent that it has
|
yuuji@0
|
870 made any independent effort to identify any such rights. Information
|
yuuji@0
|
871 on the procedures with respect to rights in RFC documents can be
|
yuuji@0
|
872 found in BCP 78 and BCP 79.
|
yuuji@0
|
873
|
yuuji@0
|
874 Copies of IPR disclosures made to the IETF Secretariat and any
|
yuuji@0
|
875 assurances of licenses to be made available, or the result of an
|
yuuji@0
|
876 attempt made to obtain a general license or permission for the use of
|
yuuji@0
|
877 such proprietary rights by implementers or users of this
|
yuuji@0
|
878 specification can be obtained from the IETF on-line IPR repository at
|
yuuji@0
|
879 http://www.ietf.org/ipr.
|
yuuji@0
|
880
|
yuuji@0
|
881 The IETF invites any interested party to bring to its attention any
|
yuuji@0
|
882 copyrights, patents or patent applications, or other proprietary
|
yuuji@0
|
883 rights that may cover technology that may be required to implement
|
yuuji@0
|
884 this standard. Please address the information to the IETF at
|
yuuji@0
|
885 ietf-ipr@ietf.org.
|
yuuji@0
|
886
|