SANS Software Security Institute

Security 651: Exploiting Regular Expressions to Process Text

Overview

What are Regular Expressions?

Regular expressions, also known as RegEx, are a compact way of describing complex patterns in text. RegEx patterns can be used to find, replace, edit, and filter text in files and databases. As an IT professional, you may already know some RegEx. If you're like most of us, you probably dread RegEx, but you also know that it's something you need to learn. This is your chance to improve your RegEx skills with the SANS RegEx course.

If you've ever had to do any of the following, this class is for you:

  • Search through log files for a particular error message.
  • Replace a string of text in one or more documents.
  • Clean up a database that contains irregular data.
  • Find out exactly when a security breach occurred.
  • Clean up and/or convert HTML, XML, or XHTML.
  • Filter out certain types of data from your server logs.
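
As a rough illustration of the first two tasks above, here is a minimal Python sketch that pulls error entries out of a log and replaces a piece of text on every line. The log lines, the timestamp layout, and the helper names `find_errors` and `mask_users` are all invented for this example; real logs will need a pattern adapted to their own format.

```python
import re

# Hypothetical log lines, invented for illustration only.
LOG_LINES = [
    "2024-01-15 10:02:11 INFO  service started",
    "2024-01-15 10:05:42 ERROR disk quota exceeded for user alice",
    "2024-01-15 10:07:03 WARN  retrying connection",
    "2024-01-15 10:09:57 ERROR login failed for user bob",
]

# A timestamp, the literal word ERROR, then the rest of the line as the message.
ERROR_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ERROR\s+(.*)$"
)


def find_errors(lines):
    """Return (timestamp, message) pairs for lines that record an error."""
    matches = (ERROR_PATTERN.match(line) for line in lines)
    return [(m.group(1), m.group(2)) for m in matches if m]


def mask_users(line):
    """Replace the name that follows 'user ' with a placeholder."""
    return re.sub(r"\buser \w+", "user <redacted>", line)


errors = find_errors(LOG_LINES)
masked = [mask_users(line) for line in LOG_LINES]

for timestamp, message in errors:
    print(timestamp, "->", message)
```

The captured timestamps show when each error occurred, which is the same idea behind narrowing down the time of a security breach.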

Why is RegEx important?

  • It saves time and money. Long and tedious editing projects can be completed quickly with short, concise RegEx statements.
  • Regular expressions are everywhere. Popular tools, editors, and languages such as grep, egrep, vi, sed, lex, awk, emacs, Perl, Tcl, Java, PHP, Python, and Windows findstr support regular expressions.
  • RegEx knowledge is a useful skill you can use every day to make your job easier.
  • RegEx can make your editing more accurate. Processing files by hand is slow, tedious, and prone to error.
  • Log files and electronic records continue to grow larger and more complex, and organizations are now subject to more stringent electronic discovery laws. You could be the hero of your organization if you have the ability to find the missing records.
  • If you learn RegEx, that bully at the beach will no longer kick sand in your face. Well, this might be stretching the truth a bit, but at least you can impress your co-workers with your RegEx skills.

Here are just a few of the skills you'll learn in this class:

  • Match exactly this literal character sequence.
  • Match exactly one character (any character, not a particular one).
  • Match any one of this list of characters.
  • Match any one of this list of sub-expressions.
  • Match any digit.
  • Match any whitespace.
  • Match this sub-expression only at the start or end of a line.
  • These and many more...
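
To make the skills above concrete, the following Python sketch pairs each one with a small pattern and sample text. The sample strings and the names `EXAMPLES`, `ANCHOR_TEXT`, and `results` are invented for illustration. Note that `\d` and `\s` are Perl-style shorthands supported by Python; POSIX tools such as grep and sed generally write these as `[[:digit:]]` and `[[:space:]]`.

```python
import re

# Each entry pairs one skill with a pattern and sample text.
# All sample strings are invented for illustration.
EXAMPLES = {
    "literal sequence": (r"error", "disk error on sda"),
    "any one character": (r"c.t", "cat cot c t cart"),
    "one of a list of characters": (r"[aeiou]", "regex"),
    "one of a list of sub-expressions": (r"grep|sed|awk", "use sed, then awk"),
    "any digit": (r"\d", "port 8080"),
    "any whitespace": (r"\s", "a b\tc"),
}

# Collect every non-overlapping match for each pattern.
results = {
    skill: re.findall(pattern, text)
    for skill, (pattern, text) in EXAMPLES.items()
}

# Anchors: with re.MULTILINE, ^ and $ match at the start and end of each line.
ANCHOR_TEXT = "ERROR first\nsaw an ERROR\nERROR last"
at_line_start = re.findall(r"^ERROR", ANCHOR_TEXT, re.MULTILINE)
at_line_end = re.findall(r"ERROR$", ANCHOR_TEXT, re.MULTILINE)

for skill, found in results.items():
    print(f"{skill}: {found}")
print("lines starting with ERROR:", len(at_line_start))
print("lines ending with ERROR:", len(at_line_end))
```

In this sample, `c.t` matches "cat", "cot", and "c t" but not "cart", because the dot stands for exactly one character.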

Laptop

Laptop Required

The laptop should be running Linux, UNIX, or Cygwin (or Windows versions of the grep, egrep, and sed tools). Two files will be used for the exercises. They can be downloaded from the web or copied from a classmate's thumb drive. There are no special speed, capacity, or peripheral requirements other than the ability to obtain the practice files from one of these sources.