Java Regular Expressions

http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html

Java and Regular Expressions - Tutorial

 

Removing HTML tags

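// Matches opening and closing tags; \\w* replaces the nested quantifier (\\w+)*, which can backtrack heavily on malformed input
Pattern p = Pattern.compile("<(/?)(\\w*)([^<>]*)>");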
Matcher m = p.matcher(body);

body = m.replaceAll("");

 

String content = str.replaceAll("<(/)?([a-zA-Z]*)(\\s[a-zA-Z]*=[^>]*)?(\\s)*(/)?>", "");

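// Alternative: also strips <!-- --> comments and <!...> declarations, and tolerates '>' inside quoted attribute values
content = str.replaceAll("(?:<!.*?(?:--.*?--\\s*)*.*?>)|(?:<(?:[^>'\"]*|\".*?\"|'.*?')+>)","");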
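
The tag-removal snippets above can be wrapped in a small reusable helper. This is only a minimal sketch: the class name `HtmlTagStripper` and the method `stripTags` are hypothetical names chosen for illustration, and the pattern uses `\\w*` for the tag name. Regex-based stripping is a heuristic and not a substitute for a real HTML parser.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not from the original note): removes HTML tags with a precompiled pattern.
final class HtmlTagStripper {
    // Compiled once and reused; matches tags such as <p>, </p>, <br/>, <a href="...">.
    private static final Pattern TAG = Pattern.compile("<(/?)(\\w*)([^<>]*)>");

    // Returns the input with every tag-like substring removed; null becomes "".
    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String body = "<p class=\"x\">Hello, <b>world</b>!</p>";
        System.out.println(stripTags(body)); // prints: Hello, world!
    }
}
```

Precompiling the `Pattern` avoids recompiling the expression on every call, which is what `String.replaceAll` does internally.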

 

Extracting image files

String source = "<img src=\"...\">"; // HTML to search (URL elided)
String pattern = "http://[^\\s\"'<>]+\\.(?:jpg|gif)"; // http:// URLs ending in .jpg or .gif
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(source);
System.out.println();
while(m.find()) System.out.println(m.group());
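
As a fuller illustration of the `find()` / `group()` loop above, the following sketch collects image URLs from `src` attributes into a list. The class name `ImageUrlExtractor`, the method `extract`, and the `example.com` URLs are assumptions made for this example and are not part of the original note. It handles only `http://` URLs ending in `.jpg` or `.gif`, as in the pattern above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not from the original note): extracts .jpg/.gif URLs from <img> tags.
final class ImageUrlExtractor {
    // Group 1 captures the quoted src value; matching is case-insensitive so <IMG> also works.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"'](http://[^\"'\\s>]+\\.(?:jpg|gif))[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns every matching image URL in document order.
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<String>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample input for illustration only.
        String source = "<img src=\"http://example.com/a.jpg\"><p>text</p>"
                + "<IMG alt='x' src='http://example.com/b.gif'>"
                + "<img src=\"http://example.com/c.png\">";
        System.out.println(extract(source));
        // prints: [http://example.com/a.jpg, http://example.com/b.gif]
    }
}
```

Using a capturing group for the attribute value means `m.group(1)` returns just the URL, while `m.group()` would return the whole matched tag prefix.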