#4320 NORM First D: Need support for hyperlinks

Zarro Boogs per Child bugtracker at laptop.org
Tue Oct 23 22:32:56 EDT 2007


#4320: Need support for hyperlinks
----------------------------+-----------------------------------------------
  Reporter:  Eben           |       Owner:  morgs                 
      Type:  enhancement    |      Status:  reopened              
  Priority:  normal         |   Milestone:  First Deployment, V1.0
 Component:  chat-activity  |     Version:                        
Resolution:                 |    Keywords:                        
  Verified:  0              |  
----------------------------+-----------------------------------------------

Comment(by AlbertCahalan):

 Replying to [comment:6 Eben]:
 > Just a note on the regexp. It doesn't yet handle trailing characters
 such as periods, closing parens, commas, semicolons, etc.  These characters
 are likely part of the syntax of the sentence, and not part of the URL.

 This seems to do the job.

 {{{
 egrep -99 --color \
 '((http|ftp)s?://)?(([-a-zA-Z0-9]+[.])+[-a-zA-Z0-9]{2,}|([0-9]{1,3}[.]){3}[0-9]{1,3})(:[1-9][0-9]{0,4})?(/[-a-zA-Z0-9/%~@&_+=;:,.?#]*[a-zA-Z0-9/])?'
 }}}

 There is a tradeoff to be made. In general the above errs on the side of
 choosing something as a URL, so you'd get laptop.org from this sentence.
 You'd not get hello.c, because the last label must be at least two
 characters, but sugar.py would count, since "py" is a country-code
 top-level domain (Paraguay's). Usernames and passwords built into the URL
 are not supported; they are very rare and often considered to be bad
 security practice. The same goes for unescaped non-ASCII.
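
As a rough check of the tradeoffs described above, the following sketch runs the same expression through Python's `re` module. The choice of Python, the names `URL_RE` and `find_urls`, and the sample sentences are assumptions made for illustration; none of them come from the Chat activity's code.

```python
import re

# The expression from the egrep command above, split across lines
# but otherwise copied unchanged.
URL_RE = re.compile(
    r'((http|ftp)s?://)?'
    r'(([-a-zA-Z0-9]+[.])+[-a-zA-Z0-9]{2,}|([0-9]{1,3}[.]){3}[0-9]{1,3})'
    r'(:[1-9][0-9]{0,4})?'
    r'(/[-a-zA-Z0-9/%~@&_+=;:,.?#]*[a-zA-Z0-9/])?'
)


def find_urls(text):
    """Return every substring of text that the expression treats as a URL."""
    # finditer + group(0) is used because the pattern contains groups,
    # which would make findall return tuples instead of whole matches.
    return [m.group(0) for m in URL_RE.finditer(text)]


# A bare host name is picked up; the sentence-ending period is not.
print(find_urls("so you'd get laptop.org from this sentence."))
# ['laptop.org']

# hello.c is skipped (one-letter last label); sugar.py is taken as a host.
print(find_urls("Compile hello.c, then run sugar.py."))
# ['sugar.py']

# The closing paren and period after the path are left out of the match.
print(find_urls("(see https://dev.laptop.org/ticket/4320)."))
# ['https://dev.laptop.org/ticket/4320']
```

A caller that turns matches into links would presumably still need to prepend a protocol to bare matches such as `laptop.org`; that step is not covered here.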

 Breaking it down into smallish semi-readable chunks:

 {{{
  optional protocol part (does http, https, ftp and ftps; not sftp, irc or mailto)
  ((http|ftp)s?://)?

  fully-qualified names and IPv4 addresses are accepted
  (([-a-zA-Z0-9]+[.])+[-a-zA-Z0-9]{2,}|([0-9]{1,3}[.]){3}[0-9]{1,3})

  port numbers are decimal, 1 to 5 digits (accepts 99999 but not 0377)
  (:[1-9][0-9]{0,4})?

  this gets the rest, disallowing some trailing punctuation
  (/[-a-zA-Z0-9/%~@&_+=;:,.?#]*[a-zA-Z0-9/])?
 }}}
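
The chunks above can be glued back together and exercised one at a time. This is a minimal sketch in Python; the constant names, the `first_url` helper, and the `example.org` inputs are hypothetical and chosen only for illustration.

```python
import re

# The four chunks from the breakdown above, in order.
PROTOCOL = r'((http|ftp)s?://)?'
HOST = r'(([-a-zA-Z0-9]+[.])+[-a-zA-Z0-9]{2,}|([0-9]{1,3}[.]){3}[0-9]{1,3})'
PORT = r'(:[1-9][0-9]{0,4})?'
PATH = r'(/[-a-zA-Z0-9/%~@&_+=;:,.?#]*[a-zA-Z0-9/])?'

URL_RE = re.compile(PROTOCOL + HOST + PORT + PATH)


def first_url(text):
    """Return the first match in text, or None if nothing looks like a URL."""
    m = URL_RE.search(text)
    return m.group(0) if m else None


# ftps is a known protocol and a five-digit port is accepted.
print(first_url("ftps://example.org:99999/pub/"))
# ftps://example.org:99999/pub/

# A port with a leading zero is not consumed; only the host matches.
print(first_url("example.org:0377"))
# example.org

# irc is not a known protocol, so the match starts at the host name.
print(first_url("irc://irc.example.org"))
# irc.example.org
```

One difference to be aware of if the expression is ported this way: Python's `re` tries alternatives left to right and stops at the first one that works, whereas egrep reports the longest match. For `192.168.1.1` the host-name branch therefore wins in Python and the match stops at `192.168`. Trying the IPv4 branch first is one possible adjustment.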

-- 
Ticket URL: <https://dev.laptop.org/ticket/4320#comment:7>
One Laptop Per Child <https://dev.laptop.org>
OLPC bug tracking system


