Better regex for md_browser.py

This improves the autolinkifier regex used by md_browser.py. The old
regex would linkify URLs until the end of the line, rather than the
correct behavior of linkifying only until the end of the URL.
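
The difference can be reproduced outside md_browser.py with a minimal
sketch. It assumes plain re.search rather than the markdown Pattern
wrapper, and the sample sentence and the OLD_/NEW_ names are hypothetical,
chosen only for illustration:

```python
import re

# Old pattern: any spelling of "http(s)", then everything up to a ">" or
# the end of the line.
OLD_AUTOLINK_RE = r'([Hh][Tt][Tt][Pp][Ss]?://[^>]*)'

# New pattern: lowercase scheme only, URL characters, and a final
# character that is not English punctuation.
NEW_AUTOLINK_RE = (r'(https?://[a-zA-Z0-9$_.+!*\',%;:@=?#/~<>-]+'
                   r'[a-zA-Z0-9$_+*\'%@=#/~<-])')

# Hypothetical input line, not taken from the docs listed under "Test:".
line = 'See https://example.com/docs. Then read on.'

old_match = re.search(OLD_AUTOLINK_RE, line).group(1)
new_match = re.search(NEW_AUTOLINK_RE, line).group(1)

print(old_match)  # https://example.com/docs. Then read on.
print(new_match)  # https://example.com/docs

# Uppercase schemes are no longer linkified by the new pattern.
uppercase_match = re.search(NEW_AUTOLINK_RE, 'HTTP://example.com')
print(uppercase_match)  # None
```

The trailing "." is left out of the new match because the second character
class excludes sentence punctuation, so the greedy first class backtracks
by one character.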

This also drops support for non-lowercase spellings of "HTTP", since the
gitiles implementation (linked in a code comment) doesn't seem to support
them anyway.

Lastly, this updates the code comment to the new path for the gitiles
parser code.

Bug: 968865
Test: tools/md_browser/md_browser.py android_webview/docs/README.md
Test: tools/md_browser/md_browser.py android_webview/docs/quick-start.md
Test: tools/md_browser/md_browser.py android_webview/docs/webview-shell.md
Test: tools/md_browser/md_browser.py android_webview/docs/test-instructions.md
Test: On all the above, observe all autolinks work as intended
Change-Id: I1b5c6b2496ef64b1da2aa00d11b340c40761c3f5
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1638123
Commit-Queue: Nate Fischer <[email protected]>
Reviewed-by: Dirk Pranke <[email protected]>
Cr-Commit-Position: refs/heads/master@{#665380}
diff --git a/tools/md_browser/gitiles_autolink.py b/tools/md_browser/gitiles_autolink.py
index 5cfeb213..eb77ebb51 100644
--- a/tools/md_browser/gitiles_autolink.py
+++ b/tools/md_browser/gitiles_autolink.py
@@ -6,14 +6,20 @@
 
 This extention auto links basic URLs that aren't bracketed by <...>.
 
-https://gerrit.googlesource.com/gitiles/+/master/gitiles-servlet/src/main/java/com/google/gitiles/Linkifier.java
+https://gerrit.googlesource.com/gitiles/+/master/java/com/google/gitiles/Linkifier.java
 """
 
 from markdown.inlinepatterns import (AutolinkPattern, Pattern)
 from markdown.extensions import Extension
 
 
-AUTOLINK_RE = r'([Hh][Tt][Tt][Pp][Ss]?://[^>]*)'
+# Best effort attempt to match URLs without matching past the end of the URL.
+# The first "[]" is copied from Linkifier.java (safe, reserved, and unsafe
+# characters). The second "[]" is similar to the first, but with English
+# punctuation removed, since the gitiles parser treats these as punctuation
+# in the sentence, rather than the final character of the URL.
+AUTOLINK_RE = (r'(https?://[a-zA-Z0-9$_.+!*\',%;:@=?#/~<>-]+'
+               r'[a-zA-Z0-9$_+*\'%@=#/~<-])')
 
 
 class _GitilesSmartQuotesExtension(Extension):