Created August 10, 2013 03:58
A better Ruby syntax highlighter for Sublime Text. Combines the Ruby bundle that ships with ST, recent updates to the TextMate bundle, and a tmLanguage file called "Experimental Ruby".
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "">
<plist version="1.0">
TODO: unresolved issues
"p &lt;&lt; end
print me!
not recognized as a heredoc
there is no way to distinguish perfectly between the &lt;&lt; operator and the start
of a heredoc. Currently, we require assignment to recognize a heredoc. More
refinement is possible.
• Heredocs with indented terminators (&lt;&lt;-) are always distinguishable, however.
• Nested heredocs are not really supportable at present
print &lt;&lt;-'THERE'
This is single quoted.
The above used #{}
From Programming Ruby p306; should be a non-interpolated heredoc.
'\332' is not recognized as slash3.. which should be octal 332.
plain regexp.. should be easy.
':p' is recognized as a symbol.. it's 2 things ':' and 'p'.
:'b' has same problem.
ternary operator rule, precedOence stuff, symbol rule.
but also consider 'a.b?(:c)' ??
|( "(\\.|[^"])*+" # eat a double quoted string
| '(\\.|[^'])*+' # eat a single quoted string
| [^#"'] # eat all but comments and strings
( \s (do|begin|case)
| (?&lt;!\$)[-+=&amp;|*/~%^&lt;&gt;~] \s*+ (if|unless)
(?! [^;]*+ ; .*? \bend\b )
|( "(\\.|[^"])*+" # eat a double quoted string
| '(\\.|[^'])*+' # eat a single quoted string
| [^#"'] # eat all but comments and strings
( \{ (?! [^}]*+ \} )
| \[ (?! [^\]]*+ \] )
| [#] .*? \(fold\) \s*+ $ # Sune’s special marker
( (^|;) \s*+ end \s*+ ([#].*)? $
| (^|;) \s*+ end \. .* $
| ^ \s*+ [}\]] ,? \s*+ ([#].*)? $
| [#] .*? \(end\) \s*+ $ # Sune’s special marker
| ^=end
<string>else if is a common mistake carried over from other languages. It works if you put in a second end, but it’s never what you want.</string>
<string>symbols as hash key (1.9 syntax)</string>
<string>symbols as hash key (1.8 syntax)</string>
<string>everything being a reserved word, not a value and needing a 'end' is a..</string>
<string>contextual smart pair support for block parameters</string>
<string>contextual smart pair support</string>
<string> as above, just doesn't need a 'end' and does a logic operation</string>
<string> just as above but being not a logical operation</string>
<string> everything being a method but having a special function is a..</string>
(?=def\b) # an optimization to help Oniguruma fail fast
(?&lt;=^|\s)(def)\s+ # the def keyword
( (?&gt;[a-zA-Z_]\w*(?&gt;\.|::))? # a method name prefix
(?&gt;[a-zA-Z_]\w*(?&gt;[?!]|=(?!&gt;))? # the method name
|===?|&gt;[&gt;=]?|&lt;=&gt;|&lt;[&lt;=]?|[%&amp;`/\|]|\*\*?|=?~|[-+]@?|\[\]=?) ) # …or an operator method
\s*(\() # the opening parenthesis for arguments
<string>the method pattern comes from the symbol pattern, see there for an explanation</string>
(?=def\b) # an optimization to help Oniguruma fail fast
(?&lt;=^|\s)(def)\s+ # the def keyword
( (?&gt;[a-zA-Z_]\w*(?&gt;\.|::))? # a method name prefix
(?&gt;[a-zA-Z_]\w*(?&gt;[?!]|=(?!&gt;))? # the method name
|===?|&gt;[&gt;=]?|&lt;=&gt;|&lt;[&lt;=]?|[%&amp;`/\|]|\*\*?|=?~|[-+]@?|\[\]=?) ) # …or an operator method
[ \t] # the space separating the arguments
(?=[ \t]*[^\s#;]) # make sure arguments and not a comment follow
<string>same as the previous rule, but without parentheses around the arguments</string>
<string> the optional name is just to catch the def also without a method-name</string>
(?=def\b) # an optimization to help Oniguruma fail fast
(?&lt;=^|\s)(def)\b # the def keyword
( \s+ # an optional group of whitespace followed by…
( (?&gt;[a-zA-Z_]\w*(?&gt;\.|::))? # a method name prefix
(?&gt;[a-zA-Z_]\w*(?&gt;[?!]|=(?!&gt;))? # the method name
|===?|&gt;[&gt;=]?|&lt;=&gt;|&lt;[&lt;=]?|[%&amp;`/\|]|\*\*?|=?~|[-+]@?|\[\]=?) ) )? # …or an operator method
<string>Needs higher precedence than regular expressions.</string>
<string>single quoted string (does not allow interpolation)</string>
<string>double quoted string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
<string>execute string (allows for interpolation)</string>
^ # beginning of line
| (?&lt;= # or look-behind on:
| [\s;]if\s # keywords
| [\s;]elsif\s
| [\s;]while\s
| [\s;]unless\s
| [\s;]when\s
| [\s;]assert_match\s
| [\s;]or\s # boolean operators
| [\s;]and\s
| [\s;]not\s
| [\s.]index\s # methods
| [\s.]scan\s
| [\s.]sub\s
| [\s.]sub!\s
| [\s.]gsub\s
| [\s.]gsub!\s
| [\s.]match\s
| (?&lt;= # or a look-behind with line anchor:
^when\s # duplication necessary due to limits of regex
| ^if\s
| ^elsif\s
| ^while\s
| ^unless\s
<string>regular expressions (normal)
we only start a regexp if the character before it (excluding whitespace)
is what we think is before a regexp
<string>regular expressions (literal)</string>
<string>regular expressions (literal)</string>
<string>regular expressions (literal)</string>
<string>regular expressions (literal)</string>
<string>regular expressions (literal)</string>
<string>literal capable of interpolation ()</string>
<string>literal capable of interpolation []</string>
<string>literal capable of interpolation &lt;&gt;</string>
<string>literal capable of interpolation -- {}</string>
<string>literal capable of interpolation -- wildcard</string>
<string>literal capable of interpolation -- wildcard</string>
<string>literal incapable of interpolation -- ()</string>
<string>literal incapable of interpolation -- &lt;&gt;</string>
<string>literal incapable of interpolation -- []</string>
<string>literal incapable of interpolation -- {}</string>
<string>literal incapable of interpolation -- wildcard</string>
<string>Can't be named because it's not necessarily an escape.</string>
<string>multiline comments</string>
<string>(^[ \t]+)?(?=#)</string>
matches questionmark-letters.
examples (1st alternation = hex):
?\x1 ?\x61
examples (2nd alternation = octal):
?\0 ?\07 ?\017
examples (3rd alternation = escaped):
?\n ?\b
examples (4th alternation = meta-ctrl):
?\C-a ?\M-a ?\C-\M-\C-\M-a
examples (5th alternation = normal):
?a ?A ?0
?* ?" ?(
?. ?#
the negative lookbehind prevents against matching
<string>__END__ marker</string>
<string>(?=&lt;?xml|&lt;(?i:html\b)|!DOCTYPE (?i:html\b))</string>
<string>Heredoc with embedded html</string>
<string>Heredoc with embedded sql</string>
<string>Heredoc with embedded css</string>
<string>Heredoc with embedded c++</string>
<string>Heredoc with embedded c</string>
<string>Heredoc with embedded javascript</string>
<string>Heredoc with embedded jQuery javascript</string>
<string>Heredoc with embedded shell</string>
<string>Heredoc with embedded lua</string>
<string>Heredoc with embedded ruby</string>
<string>heredoc with indented terminator</string>
<string>&lt;=&gt;|&lt;(?!&lt;|=)|&gt;(?!&lt;|=|&gt;)|&lt;=|&gt;=|===|==|=~|!=|!~|(?&lt;=[ \t])\?</string>
<string>(?&lt;=[ \t])!+|\bnot\b|&amp;&amp;|\band\b|\|\||\bor\b|\^</string>
<string>This is kind of experimental. There really is no way to perfectly match all regular variables, but you can pretty well assume that any normal word in certain circumstances that hasn't already been scoped as something else is probably a variable, and the advantages beat the potential errors.</string>
<string>(?&lt;=^|\s)(#)\s(?=[[a-zA-Z0-9,. \t?!-][^\x{00}-\x{7F}]]*$)</string>
<string>We are restrictive in what we allow to go after the comment character to avoid false positives, since the availability of comments depends on regexp flags.</string>
<string>^(?=(\t| ))</string>
<string>( )( )?( )?( )?( )?( )?( )?( )?( )?( )?( )?</string>

renanra commented Oct 20, 2013

Great! It just does the trick.

MattDMo commented Dec 2, 2013

Have you considered putting this on Package Control? If so, I have a couple of suggestions:

  1. Don't name it RubyNext.tmLanguage, just call it RubyNext or something similar. Sublime may choke if a directory name is the same as a file name.
  2. Feel free (if you haven't already, I didn't check) to include and improve on my revisions adding support for the %i[foo bar] # [:foo, :bar] symbol array literal notation.
  3. In order for this to coexist with the existing Ruby.tmLanguage, you'll need to change the UUID. Here's one, free of charge 😀 : BF82A477-650C-46A4-9422-6CE8D9636E97
  4. Advertise heavily on the Sublime forum and you'll likely get lots of suggestions and (hopefully) contributions. I published Python Improved a month or two ago, and have been amazed at the feedback - people really are looking for up-to-date versions of their favorite language definitions, and with the popularity of Ruby and Rails I'm sure you'll get a ton of installs.

Good luck!
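
For reference, a minimal Ruby sketch of two of the points above: the `%i[]` symbol array literal (Ruby 2.0 and later) that the grammar would need to highlight, and one way to generate a fresh UUID for the grammar's `uuid` key. The helper name `fresh_tm_uuid` is hypothetical, and upcasing the value is only an assumption based on the style of the UUID quoted above.

```ruby
require "securerandom"

# %i[] builds an array of symbols without interpolation (Ruby 2.0+).
symbols = %i[foo bar]

# Hypothetical helper: a random (version 4) UUID, upcased to match the
# style of the UUID quoted above. Any unique value should do for the
# grammar's uuid key; this is a sketch, not a Sublime Text requirement.
def fresh_tm_uuid
  SecureRandom.uuid.upcase
end

p symbols          # the two symbols :foo and :bar
puts fresh_tm_uuid # a different value on every run
```

The generated value would replace the existing string under the `uuid` key in the tmLanguage file.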

The discrepancy between the actual Base 16 syntax highlighting and the preview was frustrating. Thank you!

shunwen commented May 6, 2014

Typo in line 45 "precedOence"?

Noob here. How do I install this?

To install this on a Mac...

  1. Download the "RubyNext.tmLanguage" file
  2. Open Sublime Text 3
  3. Click Preferences > Browse Packages...
  4. Put the "RubyNext.tmLanguage" file into "User" Directory
  5. Click View > Syntax Highlighting > "Open All With Current Extension as..." > Better Ruby

I don't know why it's not called "RubyNext"... I just renamed mine so that I would find it more easily (who would've thought to look in the "B"s for this?)
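
The copy step above could also be scripted. This is a rough sketch, assuming the usual Sublime Text 3 layout on macOS (`~/Library/Application Support/Sublime Text 3/Packages/User`); the helper names `default_user_dir` and `install_syntax` are made up for illustration and are not part of Sublime Text.

```ruby
require "fileutils"

# Assumed location of the User package directory for Sublime Text 3 on macOS.
def default_user_dir
  File.join(Dir.home, "Library", "Application Support",
            "Sublime Text 3", "Packages", "User")
end

# Hypothetical helper: copy a .tmLanguage file into the User package
# directory and return the path of the installed copy.
def install_syntax(source, user_dir = default_user_dir)
  raise ArgumentError, "not found: #{source}" unless File.file?(source)

  FileUtils.mkdir_p(user_dir)                          # create the directory if missing
  target = File.join(user_dir, File.basename(source))
  FileUtils.cp(source, target)
  target
end

# Usage (uncomment to run against a real download):
# install_syntax(File.expand_path("~/Downloads/RubyNext.tmLanguage"))
```

Picking the syntax from the View menu (step 5) still has to be done by hand inside Sublime Text.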
