Snowball (programming language)

From Wikipedia, the free encyclopedia

Snowball is a small string processing programming language designed for creating stemming algorithms for use in information retrieval.[1]

The name Snowball was chosen as a tribute to the SNOBOL programming language, "with which it shares the concept of string patterns delivering signals that are used to control the flow of the program."[2] The creator of Snowball, Dr. Martin Porter, "toyed with the idea of calling it 'strippergram,'" because it "effectively provides a 'suffix STRIPPER GRAMmar.'"[1]

The Snowball compiler translates a Snowball script (an .sbl file) into a program in thread-safe ANSI C, Java, Ada, C#, Go, JavaScript, Object Pascal, Python or Rust.[3][4] For ANSI C, each Snowball script produces a program file and a corresponding header file (with .c and .h extensions).[3] The compiler also checks its input script for consistency. This check uncovered a typo, undetected for 30 years, in a seminal academic paper by Dr. Julie Beth Lovins, a notable computational linguist and the creator of the Lovins stemming algorithm.[5]

The basic datatypes handled by Snowball are strings of characters, signed integers, and boolean truth values, or more simply strings, integers and booleans. Snowball's characters are either 8 or 16 bits wide, depending on the mode of use; in particular, both ASCII and 16-bit Unicode are supported.[2] As in the SNOBOL programming language, the flow of control in Snowball is arranged by the implicit use of signals (each statement returns a true or false value), rather than by the explicit use of constructs such as if, then, and break found in C and many other programming languages.[2]
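The idea of signal-driven control flow can be illustrated with a small emulation in Python. This is only a sketch and is not Snowball syntax: the names State, literal, seq, either, repeat and run_program are hypothetical helpers invented for this illustration, loosely modeled on the way Snowball commands signal true or false and on its sequencing, or, and repeat constructs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class State:
    """The string being processed and a cursor into it (hypothetical model)."""
    text: str
    cursor: int = 0


# A command acts on the state and returns its signal: True or False.
Command = Callable[[State], bool]


def literal(s: str) -> Command:
    """Signal True and advance the cursor if the text continues with s."""
    def run(st: State) -> bool:
        if st.text.startswith(s, st.cursor):
            st.cursor += len(s)
            return True
        return False
    return run


def seq(*cmds: Command) -> Command:
    """Run commands in order; signal False at the first one that fails."""
    def run(st: State) -> bool:
        return all(cmd(st) for cmd in cmds)  # all() stops at the first False
    return run


def either(first: Command, second: Command) -> Command:
    """Try first; if it signals False, restore the cursor and try second."""
    def run(st: State) -> bool:
        saved = st.cursor
        if first(st):
            return True
        st.cursor = saved
        return second(st)
    return run


def repeat(cmd: Command) -> Command:
    """Run cmd until it signals False, undo the failed attempt, signal True.

    Assumes cmd advances the cursor whenever it succeeds; otherwise this loops.
    """
    def run(st: State) -> bool:
        while True:
            saved = st.cursor
            if not cmd(st):
                st.cursor = saved
                return True
    return run


def run_program(text: str) -> tuple[bool, int]:
    """Match zero or more 'ab', then 'c' or 'd'; return (signal, cursor)."""
    program = seq(repeat(literal("ab")), either(literal("c"), literal("d")))
    st = State(text)
    return program(st), st.cursor


if __name__ == "__main__":
    print(run_program("ababd"))
    print(run_program("abx"))
```

No if or break appears in the composed program itself: whether the second step runs at all depends only on the signal returned by the step before it.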

Though the original Snowball website maintained by Dr. Martin Porter and colleague Richard Boulton has been closed since 2014 following Dr. Porter’s retirement,[1][4][6] the site itself is still accessible, and the language continues to be developed as a community project on GitHub.[1][4] Additionally, large projects like the Natural Language Toolkit (NLTK) for Python employ Snowball along with stemming algorithms designed by Dr. Porter and other contributors to the Snowball language.[7][8]

References

  1. Porter, Martin. "Snowball". Retrieved 2 September 2014.
  2. Porter, Martin. "Snowball Manual". Retrieved 2 September 2014.
  3. Porter, Martin. "Snowball: Quick introduction". Retrieved 4 May 2025.
  4. "Snowball README". 27 March 2025. Retrieved 4 May 2025.
  5. Porter, Martin (December 2001). "Lovins revisited". snowball.tartarus.org. Retrieved 6 August 2024.
  6. Porter, Martin. "Snowball - Credits". Retrieved 4 May 2025.
  7. "nltk.stem.SnowballStemmer Documentation". Natural Language Toolkit. Retrieved 4 May 2025.
  8. "Source code for nltk.stem.snowball". Natural Language Toolkit. Retrieved 4 May 2025.