Parsing flat files can be quite annoying, as the NEXUS format in bioinformatics demonstrates. Regular expressions help, and here is a fun one that splits a string on the ';' character, except where the ';' is enclosed in [] (which marks a comment in NEXUS).
(.*?)(?<!\[[^\]]*?);
This can serve as the basis for any pattern that needs to skip over comments, although I have not tested it with nested comments. Thankfully, those were beyond the scope of what I needed.
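One caveat if you want to try this in Python: the pattern relies on a variable-width lookbehind, which Python's built-in `re` module rejects, so it cannot be used there as-is. Below is a minimal sketch of the same split done with a plain character scan instead of a regex. It is a swapped-in technique, not the pattern above. The function name `split_outside_comments` and the sample string are made up for illustration, and the sketch assumes brackets are balanced. Because it counts bracket depth, it also copes with nested comments.

```python
def split_outside_comments(text, sep=";"):
    """Split text on sep, ignoring any sep that sits inside [ ... ].

    Hypothetical helper for illustration; assumes '[' and ']' are balanced.
    """
    parts = []      # finished pieces
    current = []    # characters of the piece being built
    depth = 0       # how many '[' are currently open

    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1

        if ch == sep and depth == 0:
            # A separator outside any comment ends the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)

    # Keep any trailing text that was not terminated by a separator.
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


if __name__ == "__main__":
    sample = "begin taxa; [a; b] dimensions ntax=3; end;"
    for piece in split_outside_comments(sample):
        print(piece)
```

The comment text stays attached to the piece it appears in, just as it does with the regex; stripping the comments out would be a separate step.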
09 June 2010