Regular Expressions

Regular expressions, also known as regex, allow us to match and manipulate patterns in strings.

Matching Patterns with Regular Expressions:

Regular expressions provide a way to match specific patterns within strings. They are useful for tasks such as string validation, text searching, and data extraction. Python's standard library includes the re module for working with regular expressions.

To use regular expressions, we need to import the re module:

import re

Here's an example to demonstrate how to match patterns using regular expressions:

import re

pattern = r"apple"

text = "I love apples"

match = re.search(pattern, text)
if match:
    print("Pattern found!")
else:
    print("Pattern not found!")

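In this example, we define a pattern apple and a text string I love apples. We use the re.search() function, which scans the whole string and returns a match object for the first location where the pattern matches, or None if it matches nowhere. Here the pattern matches the first five characters of apples, so we print "Pattern found!"; otherwise, we would print "Pattern not found!".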
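To see exactly what re.search() matched, the returned match object can be inspected. The following is a minimal sketch that reuses the pattern and text from the example; the variable names matched_text, start, end, and no_match are illustrative and not part of the original.

```python
import re

pattern = r"apple"
text = "I love apples"

match = re.search(pattern, text)

# re.search() returns a match object on success and None otherwise.
if match:
    matched_text = match.group()  # the substring that matched
    start, end = match.span()     # start (inclusive) and end (exclusive) indices
    print(matched_text, start, end)

# A pattern that occurs nowhere in the text yields None.
no_match = re.search(r"banana", text)
print(no_match)
```

Checking the result against None (or relying on its truthiness, as the example does) is the usual way to tell a match from a miss.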

Regular expressions provide a wide range of special characters and syntax for specifying patterns. Some common examples include:

  • . matches any single character except a newline (by default).
  • ^ matches the start of a string.
  • $ matches the end of a string, or just before a newline at the end of the string.
  • [] defines a character set, for example [aeiou] matches any one vowel.
  • * matches zero or more occurrences of the preceding element.
  • + matches one or more occurrences of the preceding element.
  • ? matches zero or one occurrence of the preceding element.
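The special characters listed above can be tried out with re.search(). This is a small sketch, not an exhaustive test; the sample patterns and strings in checks are made up for illustration.

```python
import re

# Each entry: (pattern, text, whether a match is expected)
checks = [
    (r"c.t", "cat", True),                # . matches the "a"
    (r"c.t", "c\nt", False),              # . does not match a newline by default
    (r"^I", "I love apples", True),       # ^ anchors to the start of the string
    (r"^love", "I love apples", False),   # "love" is not at the start
    (r"apples$", "I love apples", True),  # $ anchors to the end of the string
    (r"[aeiou]", "rhythm", False),        # no character from the set is present
    (r"ab*c", "ac", True),                # * allows zero "b" characters
    (r"ab+c", "ac", False),               # + requires at least one "b"
    (r"ab+c", "abbbc", True),             # + also accepts several "b" characters
    (r"colou?r", "color", True),          # ? makes the "u" optional
]

results = [re.search(p, t) is not None for p, t, _ in checks]

for (p, t, expected), found in zip(checks, results):
    print(f"{p!r} against {t!r}: match={found} (expected {expected})")
```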

Capturing Groups and Substitution:

Regular expressions also allow us to capture specific parts of a matched pattern and perform substitutions within strings.

Let's consider an example to demonstrate capturing groups and substitution:

import re

pattern = r"(\d{2})-(\d{2})-(\d{4})"

text = "Date: 25-12-2022"

match = re.search(pattern, text)
if match:
    day = match.group(1)
    month = match.group(2)
    year = match.group(3)
    print(f"Day: {day}, Month: {month}, Year: {year}")
else:
    print("No match found!")

# Substitution
new_text = re.sub(pattern, r"\2/\1/\3", text)
print(new_text)  # Output: Date: 12/25/2022

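In this example, we define a pattern (\d{2})-(\d{2})-(\d{4}) to match a date in dd-mm-yyyy format: \d{2} matches exactly two digits and \d{4} matches exactly four. Each pair of parentheses captures the part of the match it encloses as a numbered group. The re.search() function finds the first match in the text.

We then access the captured groups using match.group(n), where groups are numbered from 1 in the order of their opening parentheses, and print the day, month, and year separately.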
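Besides match.group(n), the match object offers a few other ways to read captured groups. The sketch below is a variation on the example, assuming we are free to name the groups with the (?P<name>...) syntax; the names day, month, and year are chosen here for illustration and do not appear in the original pattern.

```python
import re

# Same date pattern, with each group given a name via (?P<name>...).
pattern = r"(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})"
text = "Date: 25-12-2022"

match = re.search(pattern, text)
if match:
    whole = match.group(0)     # the entire matched text
    parts = match.groups()     # all captured groups as a tuple
    named = match.groupdict()  # named groups as a dictionary
    print(whole)
    print(parts)
    print(named["year"])       # equivalent to match.group("year")
```

Named groups remain accessible by number as well, so match.group(1) still returns the day.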

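Additionally, we can use the re.sub() function to perform substitutions within the text. In the replacement string, \1, \2, and \3 refer to the first, second, and third captured groups, so r"\2/\1/\3" swaps the day and month and replaces the hyphens with slashes, turning dd-mm-yyyy into mm/dd/yyyy. The resulting string new_text contains the substituted text; the original text is left unchanged.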
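A few further behaviours of re.sub() are worth seeing in action. The following sketch uses a made-up input string containing two dates; the variable names are illustrative.

```python
import re

pattern = r"(\d{2})-(\d{2})-(\d{4})"
text = "Start: 25-12-2022, end: 01-01-2023"

# By default, re.sub() replaces every non-overlapping match.
all_swapped = re.sub(pattern, r"\2/\1/\3", text)
print(all_swapped)

# The count argument limits how many matches are replaced.
first_only = re.sub(pattern, r"\2/\1/\3", text, count=1)
print(first_only)

# If the pattern does not match, the input is returned unchanged.
unchanged = re.sub(pattern, r"\2/\1/\3", "No dates here")
print(unchanged)
```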