5.2 Identifiers and Allowed Characters

Identifiers are names given to entities like variables, functions, types, modules, etc. In Rust:

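  1. Allowed Characters: An identifier must start with either a character that has the Unicode XID_Start property or an underscore (_). Every subsequent character must have the XID_Continue property, which already covers all XID_Start characters and the underscore. A lone _ is not an identifier; it is reserved as the wildcard pattern.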
    • XID_Start includes most letters from scripts around the world (Latin, Greek, Cyrillic, Han, etc.).
    • XID_Continue includes XID_Start characters plus digits, underscores, and various combining marks.
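    • This means identifiers like привет, 数据, my_variable, _internal, and isValid are valid (non-ASCII identifiers require Rust 1.53 or later). Note that isValid compiles but triggers a style warning, because variables are expected to use snake_case.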
  2. Restrictions:
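    • Digits (0-9) cannot be the first character. Raw identifiers do not lift this rule: r#1st_variable is rejected, because the text after r# must itself be a valid identifier.
    • Keywords cannot be used as identifiers unless escaped with r# (e.g., r#match, r#type). The keywords crate, self, super, and Self cannot be escaped this way.
    • Spaces, punctuation (like !, ?, ., -), and symbols (like #, @, $) are not allowed within identifiers. The r# prefix of a raw identifier is the only exception, and it is not part of the name itself.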
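  3. Encoding: Rust source files must be valid UTF-8, so identifiers are as well. The compiler normalizes identifiers to Unicode Normalization Form C (NFC), so two spellings that differ only in normalization name the same item.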
  4. Length: No explicit length limit, but overly long identifiers harm readability.
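
The rules above can be sketched in a short program. The helper names sum_identifiers and the values bound to each variable are hypothetical and chosen only for illustration; what matters is which spellings the compiler accepts. It assumes Rust 1.53 or later for the non-ASCII names.

```rust
/// Binds values to identifiers written in several scripts and sums them.
/// The function name and the values are illustrative assumptions.
#[allow(non_snake_case)] // `isValid` is legal but trips the snake_case lint
fn sum_identifiers() -> i32 {
    let привет = 1; // Cyrillic letters have the XID_Start property
    let 数据 = 2; // Han characters have the XID_Start property
    let my_variable = 3;
    let _internal = 4; // a leading underscore followed by XID_Continue characters
    let x1 = 5; // a digit is fine anywhere except the first position
    let isValid = true;
    if isValid {
        привет + 数据 + my_variable + _internal + x1
    } else {
        0
    }
}

/// A function named after the keyword `match`, written as a raw identifier.
/// Its parameter is likewise named after the keyword `type`.
fn r#match(r#type: &str) -> usize {
    r#type.len()
}

fn main() {
    println!("{}", sum_identifiers());
    println!("{}", r#match("struct"));
    // The following would not compile if uncommented:
    // let 1st_variable = 0;   // starts with a digit
    // let r#1st_variable = 0; // r# does not make a leading digit legal
    // let my-variable = 0;    // `-` is not allowed in identifiers
}
```

Raw identifiers are mainly useful when calling into code that was written for an older edition in which a name was not yet a keyword; for new code, picking a different name is usually simpler.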

Naming Conventions (Style; Violations Are Warnings, Not Errors):

  • snake_case: Used for variable, function, and module names (e.g., let user_count = 5;, fn calculate_mean() {}, mod network_utils {}).
  • UpperCamelCase: Used for type names (structs, enums, traits) and enum variants (e.g., struct UserAccount {}, enum Status { Connected, Disconnected }, trait Serializable {}).
  • SCREAMING_SNAKE_CASE: Used for constants and statics (e.g., const MAX_CONNECTIONS: u32 = 100;, static DEFAULT_PORT: u16 = 8080;).

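These conventions enhance readability and are strongly recommended. Violating them does not stop compilation, but rustc warns by default through the non_snake_case, non_camel_case_types, and non_upper_case_globals lints.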
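
As a minimal sketch, the three conventions can be seen side by side in one small program. The item names are taken from the examples above; the function bodies, the has_capacity and connection_status helpers, and the output format are assumptions made purely for illustration.

```rust
// SCREAMING_SNAKE_CASE for constants and statics.
const MAX_CONNECTIONS: u32 = 100;
static DEFAULT_PORT: u16 = 8080;

// UpperCamelCase for types, traits, and enum variants.
#[derive(Debug, PartialEq)]
enum Status {
    Connected,
    Disconnected,
}

struct UserAccount {
    user_count: u32, // snake_case for fields
}

trait Serializable {
    fn serialize(&self) -> String;
}

impl Serializable for UserAccount {
    fn serialize(&self) -> String {
        format!("user_count={} port={}", self.user_count, DEFAULT_PORT)
    }
}

// snake_case for modules and functions.
mod network_utils {
    /// Hypothetical helper: true while `active` is below `limit`.
    pub fn has_capacity(active: u32, limit: u32) -> bool {
        active < limit
    }
}

/// Hypothetical helper that maps remaining capacity to a `Status`.
fn connection_status(account: &UserAccount) -> Status {
    if network_utils::has_capacity(account.user_count, MAX_CONNECTIONS) {
        Status::Connected
    } else {
        Status::Disconnected
    }
}

fn main() {
    let account = UserAccount { user_count: 5 };
    println!("{} -> {:?}", account.serialize(), connection_status(&account));
}
```

Renaming, say, UserAccount to user_account would still compile, but rustc would emit a non_camel_case_types warning suggesting the conventional spelling.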