AS3: Remove HTML RegEx

I was working on an app and needed to strip all HTML tags from a string. That same functionality was needed elsewhere, so I created a class.

/**
 * String utility functions
 * @author John C. Bland II
 * @version 0.1
 */
package utils {
    public class StringUtils {

        /** Removes anything that looks like an HTML tag (<...>) from value. */
        public static function stripHTML(value:String):String {
            return value.replace(/<.*?>/g, "");
        }
    }
}

To use it, import the class and call the static method:

import utils.StringUtils;

trace(StringUtils.stripHTML("some html string"));
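For anyone outside Flash, here is the same idea as a minimal TypeScript sketch. AS3 and TypeScript both use ECMAScript-style regex literals, so the pattern carries over unchanged. The `stripHtmlMultiline` variant is a hypothetical addition, not part of the class above; it assumes you also want to remove tags that are broken across lines, which `.` will not match.

```typescript
// Same pattern as the AS3 class: remove anything that looks like <...>.
function stripHtml(value: string): string {
  return value.replace(/<.*?>/g, "");
}

// Hypothetical variant: [^>] also matches newlines, so a tag that is
// split across two lines is removed as well.
function stripHtmlMultiline(value: string): string {
  return value.replace(/<[^>]*>/g, "");
}

console.log(stripHtml("<p>Hello <b>world</b></p>"));      // Hello world
console.log(stripHtmlMultiline('<a\nhref="#">link</a>')); // link
```

Either way this is plain pattern stripping, not HTML parsing, so a stray `<` or `>` in ordinary text can still confuse it.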

Hope this helps someone.

Matching Start & End Tags

I recently had to parse a template file and replace its custom tags with different code. We already had regex to grab a single-line tag (ex – <[MyTag]>) and regex to grab a multiline tag (with content between the opening and closing tags). The problem came in when I needed nested tags. Lemme show a quick example:

bq. <[MyTag]>
some text
maybe some regular html tags
anything you want here
<[SomeOtherTag]>
some content

<[AnotherTag:Singleline]>

Well, the regex we were using would stop at the first closing tag it found (the nested tag's) instead of the closing tag that matched the opening one. I tried using a backreference (\1, which refers back to the first captured group) but that wasn’t working. Today I was turned on to a few new resources ("RegexAdvice.com":http://regexadvice.com/ and "Kodos":http://kodos.sourceforge.net) which helped me shape and mold the regex into a fully functional template parser.

bq. <(\[[a-zA-Z0-9]*\])[^>]*>([\w\W]*?)
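Here is a small TypeScript sketch of how a backreference (`\1`) keeps a match from stopping at a nested tag's close. It assumes closing tags are written `</[MyTag]>`, which is a guess for illustration, so treat `naive`, `paired`, and `closingTagOf` as hypothetical names and patterns rather than the exact production regex.

```typescript
// Assumed template syntax: <[Name]> ... </[Name]> (the closing form is a guess).
const template = [
  "<[MyTag]>",
  "some text",
  "<[SomeOtherTag]>",
  "some content",
  "</[SomeOtherTag]>",
  "</[MyTag]>",
].join("\n");

// Naive: any closing tag ends the match, so it stops at the inner one.
const naive = /<(\[[a-zA-Z0-9]*\])[^>]*>([\w\W]*?)<\/\[[a-zA-Z0-9]*\]>/;

// Backreference: \1 must repeat the captured "[MyTag]" exactly.
const paired = /<(\[[a-zA-Z0-9]*\])[^>]*>([\w\W]*?)<\/\1>/;

// Helper for the demo: returns the last line of whatever was matched.
function closingTagOf(match: RegExpExecArray | null): string {
  if (match === null) return "(no match)";
  const lines = match[0].split("\n");
  return lines[lines.length - 1];
}

console.log(closingTagOf(naive.exec(template)));  // </[SomeOtherTag]>
console.log(closingTagOf(paired.exec(template))); // </[MyTag]>
```

The same `\1` syntax works in .NET regular expressions, so the idea carries over to C# directly.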

I implemented this with C# using the following code:

bq. Regex _contentReg = new Regex(@"<(\[[a-zA-Z0-9]*\])[^>]*>([\w\W]*?)", RegexOptions.IgnoreCase);
_matchColl = _contentReg.Matches(template);

After this I merely looped over the matches and utilized the reflection yumminess that "Cody":http://blog.xyzpdq.org/ had already set up and VOILA! 🙂
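To give a rough feel for the "loop over the matches and dispatch" step, here is a TypeScript sketch. It is not the C# code: a plain lookup table of hypothetical handlers stands in for the reflection-based dispatch, and the closing-tag form `</[Name]>` is an assumption.

```typescript
// Hypothetical handlers keyed by tag name; the real code dispatched via reflection.
type TagHandler = (inner: string) => string;

const handlers: { [tag: string]: TagHandler } = {
  "[MyTag]": (inner) => "<div>" + inner + "</div>",
  "[SomeOtherTag]": (inner) => "<em>" + inner + "</em>",
};

function render(source: string): string {
  // Assumed closing syntax </[Name]>; \1 pairs each close with its own open.
  // The literal lives inside the function so every call gets a fresh regex.
  const tagPattern = /<(\[[a-zA-Z0-9]*\])[^>]*>([\w\W]*?)<\/\1>/g;
  return source.replace(tagPattern, (whole: string, tag: string, inner: string) => {
    const handler = handlers[tag];
    // Unknown tags are left untouched; known tags render their inner content first.
    return handler ? handler(render(inner)) : whole;
  });
}

console.log(render("<[MyTag]>a <[SomeOtherTag]>b</[SomeOtherTag]> c</[MyTag]>"));
// <div>a <em>b</em> c</div>
```

One caveat with this approach: a tag nested inside another tag of the same name would still pair with the first close it meets, so that case needs a real parser (or, in .NET, balancing groups).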

Anyways…all that to say…hopefully this regex helps someone else as it took quite some time to find a solution.