@egmontkob
Last active September 17, 2024 13:27
Hyperlinks in Terminal Emulators
@arvenil

arvenil commented Jul 18, 2023

Is there any way to detect if terminal supports this feature?

@AnonymouX47

AnonymouX47 commented Jul 18, 2023

@arvenil, it's answered in the document.

@alvaromuir

alvaromuir commented Aug 1, 2023

For the life of me, I can't get this to work when trying to replace a value with jq.
I would've thought this was pretty straightforward:
echo '{"link":"\x1b]8;;https://google.com\x1b\\google.com\x1b]8;;\x1b]8;;\x1b\"}' | jq
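
One possible reason this fails (a hedged aside, not a confirmed diagnosis): JSON has no `\x1b` escape, only `\uXXXX`, and the sample ends with `\x1b\"`, which escapes the closing quote instead of forming the `ESC \` terminator. The sketch below uses Python's `json` module rather than jq to show the difference; the names `good` and `bad` are illustrative only.

```python
import json

# JSON only knows \uXXXX escapes, so ESC must be spelled \u001b, and the
# String Terminator (ESC \) becomes \u001b\\ inside a JSON string.
good = r'{"link":"\u001b]8;;https://google.com\u001b\\google.com\u001b]8;;\u001b\\"}'
# Same payload written with \x1b, which is not a valid JSON escape.
bad = r'{"link":"\x1b]8;;https://google.com\x1b\\google.com\x1b]8;;\x1b\\"}'

link = json.loads(good)["link"]
print(link)  # a supporting terminal should render "google.com" as a link

try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    print("rejected:", exc.msg)
```

With jq itself, raw output (`jq -r .link`) would presumably also be needed, since jq re-escapes control characters when it prints JSON.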

@jamie-pate

jamie-pate commented Sep 6, 2023

Is there any way to detect if terminal supports this feature?

No, and no standard way to disable it, so any program that adds these links ends up spamming escape characters all over your output (edit: if they are not 100% compatible with all forms of escape codes)

]8;id=274247;https://ansible-lint.readthedocs.io/rules/syntax-check/\syntax-check[specific]]8;;\: Invalid options for ansible.builtin.include_role: vars

https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda#backward-compatibility mentions that any terminal emulator that doesn't parse it correctly should consider this a bug since it doesn't properly follow ECMA-48

@AnonymouX47

AnonymouX47 commented Sep 6, 2023

Hey @jamie-pate.

That should be considered a bug on the part of the program (I believe ansible-lint in your case) emitting the sequence; you should report the issue over there.

As Egmont explained, it's the duty of the program emitting the sequence to detect if its output stream is connected to a terminal device before emitting such sequences... The same applies to almost any other terminal control sequence, not just this.
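
A minimal sketch of that check, assuming a Python emitter; the helper name `hyperlink` is hypothetical. As the rest of this thread shows, `isatty()` only tells you the stream is a terminal device, not that the terminal actually understands OSC 8.

```python
import sys


def hyperlink(url, text, stream=sys.stdout):
    """Wrap text in an OSC 8 hyperlink only if stream is a terminal device."""
    if not stream.isatty():
        # Piped or redirected output: emit plain text, no escape sequences.
        return text
    return f"\033]8;;{url}\033\\{text}\033]8;;\033\\"


print(hyperlink("https://example.com", "example"))
```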

Next time, kindly ask nicely about things you may not be well-informed about. 😃

EDIT: For the record, I saw your previous comment.

@egmontkob
Author

Oh, and just for the fun of it (in response to a freshly deleted comment that's archived on web.archive.org)

feels like this whole thing put the horse before the cart

The last time I checked, that's where the horse belongs 😂

@jamie-pate

https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda#backward-compatibility mentions that any terminal emulator that doesn't parse it correctly should consider this a bug since it doesn't properly follow ECMA-48

I was trying to be constructive, but to clarify, I mean that the terminal emulator is faulty if it parses colors and other ECMA-48 codes, but fails to properly deal with ]8

I'm trying to help people identify where to report the issues that this is exposing in various pieces of infrastructure, e.g. concourse/concourse#4318

@mintty

mintty commented Sep 6, 2023

I mean that the terminal emulator is faulty if it parses colors and other ECMA-48 codes, but fails to properly deal with ]8

Isn't that just what the quoted section is saying?

@jamie-pate

jamie-pate commented Sep 6, 2023

I mean that the terminal emulator is faulty if it parses colors and other ECMA-48 codes, but fails to properly deal with ]8

Isn't that just what the quoted section is saying?

I was responding to this comment, which seems to be laying the blame on the emitter (Originating devices as per ECMA-48), not the consumer (Receiving devices as per ECMA-48):

Hey @jamie-pate.

That should be considered a bug on the part of the program (I believe ansible-lint in your case) emitting the sequence; you should report the issue over there.

My conclusion is that according to #backward-compatibility the Receiving device is 100% responsible for implementing the rest of ECMA-48, even if its authors just wanted fancy color support and implemented it by skimming the Wikipedia article on the subject. Therefore, if you have issues, you should report them to the implementer of the Receiving device.

@AnonymouX47

@jamie-pate Just to be clear, where exactly did you copy the following from?

]8;id=274247;https://ansible-lint.readthedocs.io/rules/syntax-check/\syntax-check[specific]]8;;\: Invalid options for ansible.builtin.include_role: vars

@jamie-pate

jamie-pate commented Sep 6, 2023

The program generating the output is ansible-lint which creates output using the rich python library

The issue shows up when running inside concourse ci

@PerBothner

Whether or not concourse-ci renders OSC 8 links as links is optional: nice but not required.
However, it should parse OSC escape sequences and ignore ones it doesn't handle. That is a pretty basic requirement for any kind of modern terminal emulator (or wrapper like screen/tmux) that claims to be more-or-less-xterm-compatible (which almost all do). If concourse-ci only claims to implement a minimal ansi/vt-NNN-style terminal, then ansible-lint/rich should not be emitting any OSC sequences.

@AnonymouX47

AnonymouX47 commented Sep 6, 2023

The program generating the output is ansible-lint which creates output using the rich python library

The issue shows up when running inside concourse ci

I see.

Firstly, I believe @PerBothner has given a good reply (possibly with the exception of the last sentence).

In addition... as far as I know, rich does the necessary detection. So, the issue seems to lie on the end of concourse-ci. If concourse-ci "tells" the programs it executes that their output is a terminal device, then it should gracefully behave as one.

@jamie-pate

jamie-pate commented Sep 6, 2023

Concourse doesn't advertise any virtual terminal support, but does support 'ANSI' color sequences ^[.... but not ^]8...

Many concourse examples add TERM=xterm-color to the environment, which advertises full support of the spec.

Removing the false terminal advertisement from my config prevents the links, but I also lose all color.

@PerBothner

If rich sees TERM=xterm-color I think it is reasonable for it to assume it is safe to emit OSC escape sequences, especially the more common ones.

So either fix/enhance concourse to ignore OSC escape sequences (probably not that difficult if it already handles ANSI color sequences), or change TERM to something closer to what concourse supports, such as TERM=ansi. (I don't know if TERM=ansi will allow colors, but you should be able to find something that works.) And Concourse examples should be fixed to not use TERM=xterm-color, as that is too much of a lie.

@jamie-pate

jamie-pate commented Sep 6, 2023

I don't know if TERM=ansi will allow colors

Unfortunately, this doesn't seem to be possible.

The issue is that 'supports colors' has been the dominating aspect of terminfo for so long that every library that sniffs for terminal capabilities will only check if it 'supports colors' and then give up if it doesn't. Other Control Function escape sequences have not been on the radar for quite a while for this class of non-interactive program. (edit: see this relevant code from the rich library as an example.) (ncurses-alike libraries will need more capabilities)

I agree the best way forward is that concourse's elm-ansi should be updated. This leaves me currently with the task of stripping unsupported sequences using sed and that is fine.

(edit: Actually, concourse+rich still guesses 'standardcolor' without TERM)
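
The stripping workaround mentioned above could be sketched as follows, in Python rather than sed. This is an assumption-laden sketch: it only handles the 7-bit form (`ESC ]` terminated by BEL or `ESC \`), not the 8-bit C1 variants, and the sample URL is made up.

```python
import re

# An OSC sequence starts with ESC ] and runs until BEL or ST (ESC \).
OSC_RE = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")


def strip_osc(text):
    """Remove OSC sequences (including OSC 8 links), keeping visible text."""
    return OSC_RE.sub("", text)


sample = "\x1b]8;id=1;https://example.com/\x1b\\syntax-check\x1b]8;;\x1b\\: message"
print(strip_osc(sample))  # → syntax-check: message
```

CSI sequences such as colors (`ESC [ ... m`) do not match this pattern, so they pass through untouched.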

@stuaxo

stuaxo commented Sep 7, 2023

Probably worth opening a ticket on concourse ci for osc8 support since it's open source.

@denolfe

denolfe commented May 9, 2024

Thanks for this! I was able to use this to make hyperlinks in my p10k prompt segments!

  # Shows the PR number as hyperlink
  prompt_pr_number() {
    if [[ ! -d .git ]]; then return; fi

    local pr_number=$(git config --get branch."$(git branch --show-current)".github-pr-owner-number | awk -F "#" '{print $3}')

    if [ -z "$pr_number" ]; then return; fi

    local pr_link=$(echo "\e]8;;https://github.com/payloadcms/payload/pull/$pr_number\e\\#$pr_number\e]8;;\e\\")
    _p9k_prompt_segment "$0$state" 208 016 '' 0 '' "$pr_link"
  }

@vin01

vin01 commented May 21, 2024

thanks for maintaining this compilation of useful resources.

Some locally installed applications might register a handler for some custom URI scheme (e.g. foobar://), and the handler application might be vulnerable if the rest of the URI is maliciously crafted. Terminal emulators might decide to whitelist only some well-known schemes and ask for the user's confirmation on lesser-known ones.

I assessed this for iTerm2 and Hyper and just published: https://vin01.github.io/piptagole/escape-sequences/iterm2/hyper/url-handlers/code-execution/2024/05/21/arbitrary-url-schemes-terminal-emulators.html (Abusing url handling in iTerm2 and Hyper for code execution)

If terminal emulators themselves act as applications handling arbitrary URL schemes, the attack surface can be quite broad.
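
A rough sketch of the scheme check suggested above, as a terminal emulator might apply it before opening a link. The set `TRUSTED_SCHEMES` and the function `needs_confirmation` are hypothetical; which schemes to trust is a policy decision this sketch does not settle.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real terminal would choose its own policy.
TRUSTED_SCHEMES = {"http", "https", "mailto"}


def needs_confirmation(uri):
    """Return True if the URI's scheme is not on the allowlist."""
    scheme = urlsplit(uri).scheme.lower()
    return scheme not in TRUSTED_SCHEMES


print(needs_confirmation("https://example.com/"))    # allowlisted scheme
print(needs_confirmation("foobar://crafted/input"))  # custom scheme
```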

@hybridgorilla897

If you want to skip the convoluted docs and just want to cut to the chase, here is a Python example:

def terminal_link(url, text):
	return '\033]8;;' + url + '\033\\' + text + '\033]8;;\033\\'

print('-->', terminal_link('https://google.com', 'Click here to open Google'), '<--')
print('-->', terminal_link('file:///etc/passwd', 'Click here to open /etc/passwd'), '<--')

@AnonymouX47

@hybridgorilla897, that's such a naive and mediocre mindset, one that has led to a lot of poor and low-quality projects/products all over the place. You can do better.

Sidenote: You seem to have joined GitHub just about an hour before posting this comment, that's crazy though 🤔. Not that it means anything, just interesting. Never had the privilege of seeing such a fresh user on here.

@hybridgorilla897

@AnonymouX47 I use throwaway accounts all the time. My main account is from 2010.

This problem is realistically not something anyone should spend more than 30 seconds on.

@AnonymouX47

AnonymouX47 commented Jun 18, 2024

This problem is realistically not something anyone should spend more than 30 seconds on.

Well... until someone runs into some issue and blames some innocent TE devs for their own negligence and incompetence, or begins to ask unnecessary questions, e.g. see:

@Explosion-Scratch

JavaScript implementation:

const OSC = "\u001B]";
const SEP = ";";
const BEL = "\u0007";
const link = (text, url) =>
  [OSC, "8", SEP, SEP, url, BEL, text, OSC, "8", SEP, SEP, BEL].join("");

@barneygale

On emitting file: URIs: why should every application be required to query and reproduce the FQDN for local links? What if the application doesn't/can't know the FQDN?

This problem doesn't arise in web browsers: I can write <a href="/foo.html">bar</a> rather than <a href="https://example.com/foo.html">bar</a>, and it works because the web browser is aware of the current domain. That could be replicated in a terminal by making ssh emit a control sequence indicating "now connected to machine example.com", with a corresponding operation to pop the current machine when the connection is closed. The terminal emulator can check the connection stack to see if/how to open a file: URI.

@egmontkob
Author

egmontkob commented Sep 3, 2024

I think the document is clear on this: there's no FQDN involved.

The hostname, as a host calls itself, is involved. It's typically not fully qualified (although on some systems it is). It's as easy to get it as a gethostname() or uname() system call, or accessing the HOSTNAME environment variable, or executing the hostname utility. Surely something similarly simple is available on Windows, too.

What do you mean an application doesn't/can't know this, how could that happen??? If an application is written in a language that doesn't give you access to any of these then get in touch with that language's developers. If it's running in a sandbox and access to these is deliberately denied then presumably so is access to the host's files, therefore local links emitted wouldn't make sense either.

FQDN, on the other hand, goes hand in hand with DNS resolution, and is for addressing some other host on the network. Possibly multiple hosts via the same FQDN, due to load balancing. And a host can have a plethora of FQDNs that resolve to that particular host (or multiple hosts including that one). The question "what is the FQDN of this host" doesn't really make sense. And, luckily, it's fully omitted from this OSC 8 game.
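
To illustrate how little is involved in the hostname lookup described above, here is a minimal sketch assuming a POSIX system and Python. The helper name `file_link` is hypothetical, and the hostname is used exactly as the host reports it, with no DNS resolution.

```python
import os
import socket
from urllib.parse import quote


def file_link(path, text):
    """Build an OSC 8 link to a local file, tagged with this host's name."""
    host = socket.gethostname()  # the name the host calls itself
    abs_path = os.path.abspath(path)
    # Percent-encode the path; "/" is left as-is by quote() by default.
    uri = f"file://{host}{quote(abs_path)}"
    return f"\033]8;;{uri}\033\\{text}\033]8;;\033\\"


print(file_link("/etc/passwd", "passwd"))
```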


Rather than the stateless design we went with, the stateful design that you propose would have the following properties:

Would need buy-in from OpenSSH developers. Or rather: would have needed buy-in from its developers when this OSC 8 feature was still only in its design phase, not knowing if it would become successful. Would they have agreed to go with it? Maybe, maybe not. We can't tell.

As far as I know, OpenSSH has absolutely nothing to do with terminal emulation; it doesn't know about any escape sequence. [It can allocate a tty line, and it can filter out unprintable characters from some messages (which it did not yet do when OSC 8 was designed; that security hole had not been discovered back then), but that's pretty much all there is to it.] So if I'm not mistaken, this would have been the first escape sequence it knows about and emits. Maybe just a few lines of code, but a significant change in its project scope. Then would it go into the business of terminal capabilities identification (terminfo and friends) to know if the terminal is supposed to support the said escape sequences? (Mind you, identifying this feature isn't addressed yet in the current specs – long story for another day.) What if it gets it wrong one way? You get faulty links, pointing to a file of the same name on a different machine. What if it gets it wrong the other way, or if it leaves out this component and unconditionally emits the escape sequences? Chances of seeing garbage in non-supporting terminals. Now, surely, there's a chance currently that OSC 8 links result in garbage (there's a section dedicated to this in the doc), but with your proposed design this could also happen if somebody doesn't wish to use any OSC 8 at all.

Handle if the connection ends abruptly. I guess emitting the escape sequence could go to the ssh client and then it could protect against that. Handle local job control (^Z, fg, bg).

What to do if for whatever reason the stack gets corrupted? E.g. cat'ing a binary file accidentally prints one escape sequence that pushes to or pops from this stack? You'd get stuck with that broken behavior probably until the end of that terminal session.

What to do on reset? On one hand its job is, well, to reset things. On the other hand, if executed within an ssh session, you don't want to break the rest of OSC 8's within that session.

Let's suppose we would've gotten buy-in from OpenSSH. What about all the other ssh implementations out there? What about other commands and protocols that let you connect to a remote host? rsh, telnet, lxc exec, and a whole bunch of others, including commercial software too? Do you see a reasonable chance of convincing all of them to support OSC 8 hostname pushing/popping? Because wherever it's missing, you get faulty links.

Alternatively, you might say that it should be the remote shell's job to push/pop. But it's again highly problematic. How long until all systems set up the shell to emit this sequence by default? Decades at least. What if the user executes a specific command rather than the shell? It's skipped then, printing faulty links. What if the connection breaks? Popping is skipped, corrupting the rest of the terminal session.


The stateful design that you propose would be extremely unreliable in many situations where the current stateless design is robust and reliable.

Accessing the hostname, which you have a problem with in the current design, is not a problem at all.


And finally, a friendly reminder to everyone who, like you, thinks that the design should have been different:

The protocol was designed 7.5 years ago. It was discussed in public bugtrackers; anyone truly interested in all the innovations and ongoing work in popular terminal emulators did have a chance to notice it and join the conversation. That ship has sailed. Sailed a long, long time ago. Even if your proposal were better (which I firmly believe it isn't), we couldn't just redesign the protocol in a backward incompatible way and convince everyone who already adopted OSC 8 to rework it.

@cben

cben commented Sep 3, 2024

Additionally, if you don't care about link robustness over ssh (or other ways to save output to file and replay it out of context / on other machines), AND have trouble obtaining the hostname, note the spec permits simply putting localhost there. Suboptimal but works fine for single-machine use cases.

Web browsers are not a fair comparison: they know exactly where a file starts/ends, so the "base url", whether default or overridden, is precisely scoped. Terminals have a much fuzzier, flat, best-effort idea of scopes (and can always be mucked up by cat untrusted_file.txt); any "action-at-a-distance" escape sequence interactions are fragile.

@barneygale

Thanks for the explanations.

Another Q: how can a Windows machine declare a link to a UNC share, like //server/share/file.txt? Would the URI be file://server/share/file.txt or something else?

@egmontkob
Author

I cannot answer this question because I'm really not familiar with Windows systems, sorry. Basically OSC 8 was designed to require a URI. So use however that UNC share translates to a URI in other contexts (e.g. what do you type into a browser's address bar, or into an HTML page's A HREF attribute?); or, if it doesn't translate, it's up to Windows-savvy developers to extend the protocol to allow UNC names too (assuming that those two namespaces don't conflict). Or maybe it needs its own scheme, like unc://..., is that a thing? I don't know.

(cben thanks a lot for amending my answer, you're absolutely right on both points, and I completely forgot about the possibility of using localhost or leaving the host empty :))

@barneygale

RFC 8089 covers some of the possibilities in its appendices:

  1. file://server/share/file.txt <-- the most common syntax; server is used as authority
  2. file:////server/share/file.txt <-- older syntax, empty authority
  3. file://///server/share/file.txt <-- variation with an additional / in the path, to match URIs like file:///c:/foo

None of them encode the hostname of the machine that generated the link, and this is one of the reasons why it's uncommon to see hostnames in file: URIs outside of UNC paths. To include the local hostname I guess we'd need something like file://myhost//server/share/file.txt, or perhaps with a triple-slash after myhost. I'll have a look to see if any existing implementations support this syntax.
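
For reference, the three forms listed above can be generated mechanically. This is only a sketch: the function name `unc_file_uris` is hypothetical, it assumes `share_path` has no leading slash, and it does no percent-encoding.

```python
def unc_file_uris(server, share_path):
    """Return the three file: URI spellings of a UNC path listed above.

    share_path is e.g. "share/file.txt" (no leading slash).
    """
    return [
        f"file://{server}/{share_path}",      # server as authority
        f"file:////{server}/{share_path}",    # older syntax, empty authority
        f"file://///{server}/{share_path}",   # extra "/" in the path
    ]


for uri in unc_file_uris("server", "share/file.txt"):
    print(uri)
```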
