Scala-regex-collection is a pure Scala collection of regular expressions.
The latest version of the library is available for Scala 2.12, 2.13, 3 and ScalaJS.
libraryDependencies += "com.github.gekomad" %% "scala-regex-collection" % "2.0.1"
You can use the predefined patterns listed below or define your own.
- Email ($abc@def.c)
- Email1 (abc@def.com)
- Email simple ($@%.$)
Ciphers
- UUID (1CC3CCBB-C749-3078-E050-1AACBE064651)
- MD5 (23f8e84c1f4e7c8814634267bd456194)
- SHA1 (1c18da5dbf74e3fc1820469cf1f54355b7eec92d)
- SHA256 (000020f89134d831f48541b2d8ec39397bc99fccf4cc86a3861257dbe6d819d1)
URL, IP, MAC Address
- IP (10.192.168.1)
- IP_6 (2001:db8:a0b:12f0::1)
- URLs (http://abc.def.com)
- Youtube (https://www.youtube.com/watch?v=9bZkp7q19f0)
- Facebook (https://www.facebook.com/thesimpsons - https://www.facebook.com/pages/)
- Twitter (https://twitter.com/rtpharry)
- MAC Address (fE:dC:bA:98:76:54)
HEX
- HEX (#F0F0F0 - 0xF0F0F0)
Bitcoin
- Bitcoin Address (3Nxwenay9Z8Lc9JBiywExpnEFiLp6Afp8v)
Phone numbers
- US phone number (555-555-5555 - (416) 555-3456)
- Italian Mobile Phone (+393471234561 - 3381234561)
- Italian Phone (02 645566 - 02/583725 - 02-583725)
Date time
- 24 Hours time (23:50:00)
- LocalDateTime (2000-12-31T11:21:19)
- LocalDate (2000-12-31)
- LocalTime (11:21:19)
- OffsetDateTime (2011-12-03T10:15:30+01:00)
- OffsetTime (10:15:30+01:00)
- ZonedDateTime (2016-12-02T11:15:30-05:00)
- MDY (1/12/1902 - 12/31/1902)
- MDY2 (1-12-1902)
- MDY3 (01/01/1900 - 12/31/9999)
- MDY4 (01-12-1902 - 12-31-2018)
- DMY (1/12/1902)
- DMY2 (12-31-1902 - 1-12-1902)
- DMY3 (01/12/1902 - 01/12/1902)
- DMY4 (01-12-1902 - 01-12-1902)
- Time (8am - 8 pm - 11 PM - 8:00 am)
Crontab
- Crontab expression (5 4 * * *)
Codes
- Italian fiscal code (BDAPPP14A01A001R)
- Italian VAT code (13297040362)
- Italian Iban (IT28 W800 0000 2921 0064 5211 151 - IT28W8000000292100645211151)
- US states (FL - CA)
- US states1 (Connecticut - Colorado)
- US zip code (43802)
- US streets (123 Park Ave Apt 123 New York City, NY 10002)
- US street numbers (P.O. Box 432)
- Italian zip code (23887)
- German streets (Mühlenstr. 33)
Currency
- USD Currency ($1.00 - 1,500.00)
- EUR Currency (0,00 € - 133,89 EUR - 133,89 EURO)
- YEN Currency (¥1.00 - 15.00 - ¥-1213,120.00)
Strings
- Not ASCII (テスト。)
- Single char ASCII (A)
- A-Z string (abc)
- String and number (a1)
- ASCII string (a1%)
Logs
- Apache error ([Fri Dec 16 02:25:55 2005] [error] [client 1.2.3.4] Client sent malformed Host header)
Numbers
- Number1 (99.99 - 1.1 - .99)
- Unsigned32 (0 - 122 - 4294967295)
- Signed (-10 - +122 - 99999999999999999999999999)
- Percentage (10%)
- Scientific (-2.384E-03)
- Single number (1)
- Celsius (-2.2 °C)
- Fahrenheit (-2.2 °F)
Coordinates
- Coordinate (N90.00.00 E180.00.00)
- Coordinate1 (45°23'36.0" N 10°33'48.0" E)
- Coordinate2 (12:12:12.223546"N - 15:17:6"S - 12°30'23.256547"S)
Programming
- Comments (/* foo */)
Credit Cards
- Visa (4111111111111)
- Master Card (5500000000000004)
- American Express (340000000000009)
- Diners Club (30000000000004)
- Discover (6011000000000004)
- JCB (3588000000000009)
validate returns an Option[String] containing the matched string, or None if the input does not match:
import com.github.gekomad.regexcollection._
import com.github.gekomad.regexcollection.Validate.validate
import java.time.LocalDateTime
assert(validate[Email]("foo@bar.com") == Some("foo@bar.com"))
assert(validate[Email]("baz") == None)
assert(validate[MD5]("fc42757b4142b0474d35fcddb228b304") == Some("fc42757b4142b0474d35fcddb228b304"))
assert(validate[LocalDateTime]("2000-12-31T11:21:19") == Some("2000-12-31T11:21:19"))
Example extracting all emails from a string
import com.github.gekomad.regexcollection.Email
import com.github.gekomad.regexcollection.Validate.findAll
assert(findAll[Email]("bar abc@def.com hi hello bar@foo.com") == List("abc@def.com", "bar@foo.com"))
assert(findAll[Email]("sdsdsd@sdf.com") == List("sdsdsd@sdf.com"))
assert(findAll[Email]("ddddd") == List())
Example extracting first email from a string
trait Bar
import com.github.gekomad.regexcollection.Validate.findFirst
import com.github.gekomad.regexcollection.Validate.findFirstIgnoreCase
import com.github.gekomad.regexcollection.Collection.Validator
implicit val myValidator = Validator[Bar]("""Bar@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*""")
assert(findFirstIgnoreCase[Bar]("bar abc@google.com hi hello bar@yahoo.com 123 Bar@foo.com") == Some("bar@yahoo.com"))
assert(findFirst[Bar]("bar abc@google.com hi hello Bar@yahoo.com 123 bar@foo.com") == Some("Bar@yahoo.com"))
regexp returns the pattern currently used for a type, for example for the Email type:
import com.github.gekomad.regexcollection.Email
import com.github.gekomad.regexcollection.Validate.regexp
assert(regexp[Email] == """[a-zA-Z0-9\.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*""")
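Because the pattern comes back as a plain string, it can be handed to the standard scala.util.matching.Regex API when the library's helpers are not enough. The sketch below is only an illustration: it copies the Email pattern from the assertion above into a local value instead of calling the library, so it runs with the standard library alone. The names RegexpReuse, emailPattern, emailRegex and isEmail are hypothetical.

```scala
import scala.util.matching.Regex

object RegexpReuse {
  // Assumption: this is the string returned by regexp[Email],
  // copied from the assertion above rather than fetched from the library.
  val emailPattern: String =
    """[a-zA-Z0-9\.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"""

  // Compile once and reuse the compiled regex.
  val emailRegex: Regex = emailPattern.r

  // True only when the whole input matches the pattern.
  def isEmail(s: String): Boolean =
    emailRegex.pattern.matcher(s).matches()
}
```

In a real project the local copy would presumably be replaced by the value of regexp[Email], so that any custom pattern in scope is picked up as well.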
It is possible to modify the default pattern of any type, for example for Email:
import com.github.gekomad.regexcollection.Email
import com.github.gekomad.regexcollection.Validate.validate
import com.github.gekomad.regexcollection.Collection.Validator
val email = "abc,a@%.d"
//using default pattern doesn't match the string
assert(validate[Email](email) == None)
//using custom pattern the string is matched
implicit val validator = Validator[Email](""".+@.+\..+""")
assert(validate[Email](email) == Some("abc,a@%.d"))
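The override above relies on an implicit value declared next to the call. One common way such a mechanism can work in Scala is a type class whose default instance lives in a companion object, so that a local implicit takes precedence over it. The sketch below illustrates that idea with stand-in definitions; it is not the library's implementation, and Email, Validator, validate and the simplified default pattern are all hypothetical here.

```scala
object ImplicitOverrideSketch {
  // Hypothetical stand-ins, not the library's own definitions.
  trait Email
  final case class Validator[A](pattern: String)

  object Validator {
    // Default instance (simplified pattern), found through the companion
    // object when no closer implicit is in scope.
    implicit val defaultEmail: Validator[Email] =
      Validator[Email]("""[a-z0-9.]+@[a-z0-9.]+""")
  }

  // Whole-string match against whichever Validator[A] the compiler resolves.
  def validate[A](s: String)(implicit v: Validator[A]): Option[String] =
    if (s.matches(v.pattern)) Some(s) else None

  val email = "abc,a@%.d"

  // No local implicit: the default pattern is used and rejects the string.
  val withDefault: Option[String] = validate[Email](email)

  // A local implicit is found first and takes precedence over the default.
  val withCustom: Option[String] = {
    implicit val custom: Validator[Email] = Validator[Email](""".+@.+\..+""")
    validate[Email](email)
  }
}
```

Under this design the scope of the implicit value decides where the custom pattern applies: declaring it inside a block limits the override to that block.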
Defining a pattern for Bar type
trait Bar
import com.github.gekomad.regexcollection.Validate.validate
import com.github.gekomad.regexcollection.Validate.validateIgnoreCase
import com.github.gekomad.regexcollection.Collection.Validator
// pattern for strings starting with "Bar"
implicit val myValidator = Validator[Bar]("Bar.*")
assert(validate[Bar]("a string") == None)
assert(validate[Bar]("Bar foo") == Some("Bar foo"))
assert(validate[Bar]("bar foo") == None)
assert(validateIgnoreCase[Bar]("bar foo") == Some("bar foo"))
Retrieve all emails using findAll and findAllIgnoreCase
trait Bar
import com.github.gekomad.regexcollection.Collection.Validator
import com.github.gekomad.regexcollection.Validate._
//get all Alice's emails
implicit val myValidator = Validator[Bar]("""Alice@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*""")
val s = "bar abc@google.com hi hello Alice@yahoo.com 123 alice@foo.com"
assert(findAll[Bar](s) == List("Alice@yahoo.com"))
assert(findAllIgnoreCase[Bar](s) == List("Alice@yahoo.com", "alice@foo.com"))
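To see the difference between the two searches without the library, the same behaviour can be reproduced with the standard library: a plain search, and one that prefixes the pattern with the embedded flag (?i). This is a sketch of the technique, not a statement about how the library implements findAllIgnoreCase; the names IgnoreCaseSketch, findAllSketch and findAllIgnoreCaseSketch are hypothetical.

```scala
object IgnoreCaseSketch {
  // Same pattern as in the example above.
  val alicePattern: String =
    """Alice@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"""

  // Case-sensitive search: every non-overlapping match, in order of appearance.
  def findAllSketch(pattern: String, s: String): List[String] =
    pattern.r.findAllIn(s).toList

  // Case-insensitive search: the leading (?i) flag applies to the whole pattern.
  def findAllIgnoreCaseSketch(pattern: String, s: String): List[String] =
    ("(?i)" + pattern).r.findAllIn(s).toList
}
```

Applied to the string s from the example above, the first function finds only Alice@yahoo.com, while the second also finds alice@foo.com.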
Instead of using a regular expression to match a string, it is possible to define a function pattern.
Example matching even numbers
trait Foo
import com.github.gekomad.regexcollection.Validate.validate
import com.github.gekomad.regexcollection.Collection.Validator
// returns Some(s) when s parses as an even Int, None otherwise
def even: String => Option[String] = { s =>
  scala.util.Try(s.toInt).toOption.filter(_ % 2 == 0).map(_ => s)
}
implicit val validator: Validator[Foo] = Validator[Foo](even)
assert(validate[Foo]("42") == Some("42"))
assert(validate[Foo]("41") == None)
assert(validate[Foo]("hello") == None)
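A function pattern is useful when the rule cannot be expressed as a regular expression. As a further sketch, the function below accepts only real ISO-8601 calendar dates: a regex can check the shape yyyy-MM-dd, but not that February 30th does not exist. It uses only java.time and scala.util.Try; the names FunctionPatternSketch and realDate are hypothetical.

```scala
import java.time.LocalDate
import scala.util.Try

object FunctionPatternSketch {
  // Some(s) when s is a real ISO-8601 calendar date, None otherwise.
  // LocalDate.parse throws on malformed input and on impossible dates,
  // and Try turns that failure into None.
  val realDate: String => Option[String] =
    s => Try(LocalDate.parse(s)).toOption.map(_ => s)
}
```

Assuming the function form of Validator shown in the even-number example, this could be registered for a marker type in the same way, for instance Validator[Foo](FunctionPatternSketch.realDate).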
For bugs, questions and discussions please use GitHub Issues.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Thanks to regexlib.com.