Source of input files for regex-redux? #268

rschwietzke · 2022-05-24T08:18:59Z

Where does the file 2500000_in for regex-redux come from? I couldn't find it. I don't have, and don't want, a full test setup locally, so I assume it is magically generated, isn't it?

JZerf · 2022-06-15T00:25:27Z

The input files used for the regex-redux problem (and also the knucleotide problem) are generated by running one of the programs from the fasta problem with the desired size as its argument. The fasta programs generate their output by combining two techniques: repeating a small sequence many times and generating sequences with a pseudorandom number generator.
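To make those two techniques concrete, here is a minimal Python sketch. It is not the actual fasta program: the function names are hypothetical, the symbols are picked uniformly as a simplification, and the generator constants are an assumption taken from the commonly published fasta problem description rather than from this repository's code.

```python
# Illustrative sketch only; this is NOT the benchmark's fasta program.
# The LCG constants below are an assumption (commonly quoted for the fasta problem).
IM, IA, IC = 139968, 3877, 29573


def repeat_sequence(seed_seq, n):
    """Return exactly n characters by cycling through seed_seq."""
    reps = n // len(seed_seq) + 1
    return (seed_seq * reps)[:n]


def lcg(seed):
    """Yield pseudorandom floats in [0, 1) from a linear congruential generator."""
    while True:
        seed = (seed * IA + IC) % IM
        yield seed / IM


def random_sequence(alphabet, n, rng):
    """Pick n symbols from alphabet using rng (uniform choice, a simplification)."""
    return "".join(alphabet[int(next(rng) * len(alphabet))] for _ in range(n))
```

Because the generator is seeded, calling `random_sequence("acgt", 50, lcg(42))` twice yields the same string. That reproducibility is what lets every benchmark run work on identical input.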

Most modern Unix-like systems include Wget or cURL and Python 3, so on these systems you can generate this file easily with the Python #1 fasta program by running the following commands in a terminal:

wget https://github.com/hanabi1224/Programming-Language-Benchmarks/raw/main/bench/algorithm/fasta/1.py
python3 1.py 2500000 > 2500000_in

or

curl -L https://github.com/hanabi1224/Programming-Language-Benchmarks/raw/main/bench/algorithm/fasta/1.py > 1.py
python3 1.py 2500000 > 2500000_in

The Python code may take around half a minute to generate a file of this size. If you want larger files, you may want to use one of the faster programs listed at https://programming-language-benchmarks.vercel.app/problem/fasta (or https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/fasta.html), but make sure you have a suitable compiler/interpreter installed first.
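If you want a quick sanity check on the generated file, something like the following sketch may help. It assumes the output is FASTA-style text, where header lines start with `>` and every other line holds sequence data. The helper name `summarize_fasta` is hypothetical and not part of the benchmark repository.

```python
def summarize_fasta(path):
    """Count header lines and sequence characters in a FASTA-style file.

    Assumes header lines start with '>' and all other lines are sequence data.
    """
    headers = 0
    sequence_chars = 0
    with open(path, "r", encoding="ascii") as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith(">"):
                headers += 1
            else:
                sequence_chars += len(line)
    return headers, sequence_chars
```

For example, `summarize_fasta("2500000_in")` returns a `(headers, sequence_chars)` tuple that you can compare across files produced by different fasta programs for the same size argument.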

1 reply

rschwietzke Jun 15, 2022
Author

Thanks for sharing. I will turn to that problem and its code next, which gives me another puzzle.
