Exploiting Markdown Syntax
Markdown is wonderful. In fact, this blog post itself is written in Markdown. I don’t need to write lengthy, unnecessary HTML for simple things like links, tables, code blocks and lists. Nor do I need to go out of my way to do simple things with text, such as making words bold or italic. Over the last five years, Markdown has attracted a great deal of attention. It is now used by Reddit, GitHub, Stack Overflow and many more. It also seems to be becoming more and more popular in newer applications pushed out by startups. Alongside this movement, a large array of Markdown parsing libraries has been released over the years.
The unofficial, general specification for Markdown describes a wide range of syntax, all of which is later parsed and converted to HTML by Markdown parsers.
Fortunately or unfortunately, depending on how you look at it, there is no standardisation of how Markdown should be processed and parsed. This is left entirely to the developer of each Markdown parsing library. Because there are so many such libraries, each one has to make a very important decision about sanitisation:
Should the Markdown library do the sanitisation, or is sanitisation not the library’s responsibility?
It seems, however, that most libraries do not provide a solid basis for sanitising user input passed to the Markdown parser. Markdown is usually used for comments and other user content, so this lack of sanitisation means developers may adopt a Markdown library without realising that they still need to sanitise and validate user input themselves. For reference, this topic has been discussed in several places.
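To make the point about sanitising on your own end concrete, here is a minimal sketch of a scheme allowlist that an application could apply to link targets before rendering them. It is not taken from any particular Markdown library: the function name `is_safe_link_target` and the set of allowed schemes are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Assumption: the application only wants ordinary web and mail links.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_link_target(url: str) -> bool:
    """Return True only if the URL carries an explicitly allowed scheme."""
    # Drop spaces and control characters, which some parsers and browsers
    # tolerate in or around a scheme (e.g. "j a v a s c r i p t:").
    cleaned = "".join(ch for ch in url if ch > " " and ch != "\x7f")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    return scheme in ALLOWED_SCHEMES


print(is_safe_link_target("https://example.com/page"))
print(is_safe_link_target("javascript:prompt(document.cookie)"))
```

An allowlist is used here rather than a blocklist of known-bad schemes, since anything unexpected is then rejected by default. This sketch also rejects relative links, and the resulting HTML would still need proper attribute escaping.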
Furthermore, a simple Google search for:
markdown xss issue site:github.com
returns a plethora of valid results. The repos found with such issues range from as few as 10 stars to 12 thousand stars.
So, how about exploiting Markdown parsers? No problem.
Please refer to the vectors below:
These vectors were initially developed by my friend Aleksa and myself, in order to check for Markdown-related vulnerabilities on services that actively use Markdown. Over the last 12 months or so, the above Markdown exploitation vectors have proven to be extremely successful in pentests and general bug hunting.
The above vectors target edge cases, and the majority of Markdown libraries are STILL vulnerable to them. For example, let’s take a look at the last vector on that list:
Now, let’s imagine that a Markdown parser uses the following logic to validate a link:
- Does the link have a scheme? Yes.
- Does the link end with a known ending (com, org, etc.)? Yes.
- Therefore, let’s convert it to HTML!
When clicked, the above link executes the payload placed after the newline control characters (%0d%0a). We have successfully placed our XSS vector in something that looks like a link, but really isn’t. This vector is most effective against parsers that use relaxed regexes to verify the validity of a link.
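The validation logic described above can be sketched roughly as follows. This is a hypothetical parser check, not code from any real Markdown library: the regex, the function names and the sample input are all assumptions, with the sample built from the description in the text (a scheme, newline control characters, and a known ending).

```python
import re
from urllib.parse import unquote

# Hypothetical relaxed check: "has a scheme" and "ends with a known ending".
NAIVE_LINK_RE = re.compile(r"^[a-z][a-z0-9+.-]*://.*(com|org|net)$", re.IGNORECASE)


def naive_looks_like_link(url: str) -> bool:
    """Mimic the permissive validation logic described above."""
    return NAIVE_LINK_RE.match(url) is not None


def decoded_script_body(url: str) -> str:
    """Percent-decode whatever follows the first colon of the URL."""
    _, _, body = url.partition(":")
    return unquote(body)


# Assumed sample input, built from the description in the text.
sample = "javascript://%0d%0aprompt(1);com"

print(naive_looks_like_link(sample))      # result of the relaxed check
print(repr(decoded_script_body(sample)))  # comment marker, line break, payload
```

The decoded body shows why the trick works: the `//` after the scheme reads as the start of a JavaScript line comment, the decoded carriage return and line feed end that comment, and whatever follows is then treated as code. The relaxed check never notices, because it only looks at the start and the end of the string.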
Telescope Persistent XSS through Markdown (CVE-2014-5144)
Telescope is a brilliant open source project, which aims to provide software (built on Meteor.js) for creating and running communities similar to Reddit and Hacker News. A long-standing Telescope feature, Markdown parsing for comments and threads, was found to be vulnerable to cross-site scripting.
The vectors used for exploiting this vulnerability:
Note: this exploit has been fixed in all versions of Telescope >= 0.9.3.
If you come up with any more vectors for Markdown exploitation, feel free to contribute them to our little evil repo, found here.