Merge pull request #4 from PGS62/KeepQuotes
Keep quotes
PGS62 authored Oct 26, 2023
2 parents 3e6e56d + 74586b7 commit 47370aa
Showing 21 changed files with 506 additions and 326 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -89,7 +89,7 @@ Public Function CSVRead(ByVal FileName As String, Optional ByVal ConvertTypes As
|Argument|Description|
|:-------|:----------|
|`FileName`|The full name of the file, including the path, or else a URL of a file, or else a string in CSV format.|
|`ConvertTypes`|Controls whether fields in the file are converted to typed values or remain as strings, and sets the treatment of "quoted fields" and space characters.<br/><br/>`ConvertTypes` should be a string of zero or more letters from allowed characters `NDBETQ`.<br/><br/>The most commonly useful letters are:<br/>1) `N` number fields are returned as numbers (Doubles).<br/>2) `D` date fields (that respect `DateFormat`) are returned as Dates.<br/>3) `B` fields matching `TrueStrings` or `FalseStrings` are returned as Booleans.<br/><br/>`ConvertTypes` is optional and defaults to the null string for no type conversion. `TRUE` is equivalent to `NDB` and `FALSE` to the null string.<br/><br/>Three further options are available:<br/>4) `E` fields that match Excel errors are converted to error values. There are fourteen of these, including `#N/A`, `#NAME?`, `#VALUE!` and `#DIV/0!`.<br/>5) `T` leading and trailing spaces are trimmed from fields. In the case of quoted fields, this will not remove spaces between the quotes.<br/>6) `Q` conversion happens for both quoted and unquoted fields; otherwise only unquoted fields are converted.<br/><br/>For most files, correct type conversion can be achieved with `ConvertTypes` as a string which applies for all columns, but type conversion can also be specified on a per-column basis.<br/><br/>Enter an array (or range) with two columns or two rows, column numbers on the left/top and type conversion (subset of `NDBETQ`) on the right/bottom. Instead of column numbers, you can enter strings matching the contents of the header row, and a column number of zero applies to all columns not otherwise referenced.<br/><br/>For convenience when calling from VBA, you can pass an array of two element arrays such as `Array(Array(0,"N"),Array(3,""),Array("Phone",""))` to convert all numbers in a file into numbers in the return except for those in column 3 and in the column(s) headed "Phone".|
|`ConvertTypes`|Controls whether fields in the file are converted to typed values or remain as strings, and sets the treatment of "quoted fields" and space characters.<br/><br/>`ConvertTypes` should be a string of zero or more letters from allowed characters `NDBETQK`.<br/><br/>The most commonly useful letters are:<br/>1) `N` number fields are returned as numbers (Doubles).<br/>2) `D` date fields (that respect `DateFormat`) are returned as Dates.<br/>3) `B` fields matching `TrueStrings` or `FalseStrings` are returned as Booleans.<br/><br/>`ConvertTypes` is optional and defaults to the null string for no type conversion. `TRUE` is equivalent to `NDB` and `FALSE` to the null string.<br/><br/>Four further options are available:<br/>4) `E` fields that match Excel errors are converted to error values. There are fourteen of these, including `#N/A`, `#NAME?`, `#VALUE!` and `#DIV/0!`.<br/>5) `T` leading and trailing spaces are trimmed from fields. In the case of quoted fields, this will not remove spaces between the quotes.<br/>6) `Q` conversion happens for both quoted and unquoted fields; otherwise only unquoted fields are converted.<br/>7) `K` quoted fields are returned with their quotes kept in place.<br/><br/>For most files, correct type conversion can be achieved with `ConvertTypes` as a string which applies for all columns, but type conversion can also be specified on a per-column basis.<br/><br/>Enter an array (or range) with two columns or two rows, column numbers on the left/top and type conversion (subset of `NDBETQK`) on the right/bottom. Instead of column numbers, you can enter strings matching the contents of the header row, and a column number of zero applies to all columns not otherwise referenced.<br/><br/>For convenience when calling from VBA, you can pass an array of two element arrays such as `Array(Array(0,"N"),Array(3,""),Array("Phone",""))` to convert all numbers in a file into numbers in the return except for those in column 3 and in the column(s) headed "Phone".|
|`Delimiter`|By default, `CSVRead` will try to detect a file's delimiter as the first instance of comma, tab, semi-colon, colon or pipe found in the first 10,000 characters of the file, searching only outside of quoted regions and outside of date-with-time fields (since these contain colons). If it can't auto-detect the delimiter, it will assume comma. If your file includes a different character or string delimiter you should pass that as the `Delimiter` argument.<br/><br/>Alternatively, enter `FALSE` as the delimiter to treat the file as "not a delimited file". In this case the return will mimic how the file would appear in a text editor such as NotePad. The file will be split into lines at all line breaks (irrespective of double quotes) and each element of the return will be a line of the file.|
|`IgnoreRepeated`|Whether delimiters which appear at the start of a line, the end of a line or immediately after another delimiter should be ignored while parsing; useful for fixed-width files with delimiter padding between fields.|
|`DateFormat`|The format of dates in the file such as `Y-M-D` (the default), `M-D-Y` or `Y/M/D`. Also `ISO` for [ISO8601](https://en.wikipedia.org/wiki/ISO_8601) (e.g., 2021-08-26T09:11:30) or `ISOZ` (time zone given e.g. 2021-08-26T13:11:30+05:00), in which case dates-with-time are returned in UTC time.|
@@ -109,24 +109,23 @@ Public Function CSVRead(ByVal FileName As String, Optional ByVal ConvertTypes As
|`HeaderRow`|This by-reference argument is for use from VBA (as opposed to from Excel formulas). It is populated with the contents of the header row, with no type conversion, though leading and trailing spaces are removed.|



[source](https://github.com/PGS62/VBA-CSV/blob/c318365294420006e60f6dca3ca264eab3b02904/vba/VBA-CSV.xlsm/modCSVReadWrite.bas#L53-L552)
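As a rough illustration of the `K` option described in the `ConvertTypes` row above, the sketch below reads a small CSV string (modelled on `testfiles/test_keep_quotes.csv` from this commit) with `ConvertTypes` set to `NK`. It assumes the VBA-CSV module is imported into the project and that a string containing line feeds is treated as CSV content rather than as a file name. `DemoKeepQuotes` is a hypothetical name, not part of the library, and the exact values returned are not asserted here.

```vba
' Hypothetical demo, not part of VBA-CSV. Assumes modCSVReadWrite is imported.
Public Sub DemoKeepQuotes()
    Dim Csv As String
    Dim Result As Variant
    Dim i As Long
    Dim j As Long

    ' Two lines modelled on testfiles/test_keep_quotes.csv.
    ' Inside a VBA string literal, "" stands for one double-quote character.
    Csv = """Col1"",""Col2""" & vbLf & "1,""x""" & vbLf

    ' "N" converts unquoted number fields; "K" asks for quoted fields
    ' to be returned with their quotes kept in place.
    Result = CSVRead(Csv, "NK")

    ' If the call did not yield an array (for example an error string), show it and stop.
    If Not IsArray(Result) Then
        Debug.Print Result
        Exit Sub
    End If

    ' Print each element with its type so the effect of "K" can be inspected.
    For i = LBound(Result, 1) To UBound(Result, 1)
        For j = LBound(Result, 2) To UBound(Result, 2)
            Debug.Print i, j, TypeName(Result(i, j)), Result(i, j)
        Next j
    Next i
End Sub
```

Running the same sub with `"N"` instead of `"NK"` should make the difference visible in the Immediate window.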

#### _CSVWrite_
Creates a comma-separated file on disk containing `Data`. Any existing file of the same name is overwritten. If successful, the function returns `FileName`, otherwise an "error string" (starts with `#`, ends with `!`) describing what went wrong.
```vba
Public Function CSVWrite(ByVal Data As Variant, Optional ByVal FileName As String, _
Optional ByVal QuoteAllStrings As Boolean = True, Optional ByVal DateFormat As String = "YYYY-MM-DD", _
Optional ByVal DateTimeFormat As String = "ISO", Optional ByVal Delimiter As String = ",", _
Optional ByVal Encoding As String = "ANSI", Optional ByVal EOL As String = vbNullString, _
Optional TrueString As String = "True", Optional FalseString As String = "False") As String
Optional ByVal QuoteAllStrings As Boolean = True, Optional ByVal DateFormat As String = "YYYY-MM-DD", _
Optional ByVal DateTimeFormat As String = "ISO", Optional ByVal Delimiter As String = ",", _
Optional ByVal Encoding As String = "ANSI", Optional ByVal EOL As String = vbNullString, _
Optional TrueString As String = "True", Optional FalseString As String = "False") As String
```

|Argument|Description|
|:-------|:----------|
|`Data`|An array of data, or an Excel range. Elements may be strings, numbers, dates, Booleans, empty, Excel errors or null values. `Data` typically has two dimensions, but if `Data` has only one dimension then the output file has a single column, one field per row.|
|`FileName`|The full name of the file, including the path. Alternatively, if `FileName` is omitted, then the function returns `Data` converted CSV-style to a string.|
|`QuoteAllStrings`|If `TRUE` (the default) then elements of `Data` that are strings are quoted before being written to file, other elements (Numbers, Booleans, Errors) are not quoted. If `FALSE` then the only elements of `Data` that are quoted are strings containing `Delimiter`, line feed, carriage return or double quote. In all cases, double quotes are escaped by another double quote.|
|`QuoteAllStrings`|If `TRUE` (the default) then elements of `Data` that are strings are quoted before being written to file, other elements (Numbers, Booleans, Errors) are not quoted. If `FALSE` then the only elements of `Data` that are quoted are strings containing `Delimiter`, line feed, carriage return or double quote. In both cases, double quotes are escaped by another double quote. If "Raw" then no strings are quoted. Use this option with care, the file written may not be in valid CSV format.|
|`DateFormat`|A format string that determines how dates, including cells formatted as dates, appear in the file. If omitted, defaults to `yyyy-mm-dd`.|
|`DateTimeFormat`|Format for datetimes. Defaults to `ISO` which abbreviates `yyyy-mm-ddThh:mm:ss`. Use `ISOZ` for ISO8601 format with time zone the same as the PC's clock. Use with care, daylight saving may be inconsistent across the datetimes in data.|
|`Delimiter`|The delimiter string, if omitted defaults to a comma. `Delimiter` may have more than one character.|
@@ -135,6 +134,7 @@ Public Function CSVWrite(ByVal Data As Variant, Optional ByVal FileName As Strin
|`TrueString`|How the Boolean value True is to be represented in the file. Optional, defaulting to "True".|
|`FalseString`|How the Boolean value False is to be represented in the file. Optional, defaulting to "False".|


[source](https://github.com/PGS62/VBA-CSV/blob/c318365294420006e60f6dca3ca264eab3b02904/vba/VBA-CSV.xlsm/modCSVReadWrite.bas#L3481-L3670)
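The three quoting modes described for `QuoteAllStrings` (`TRUE`, `FALSE` and `"Raw"`) can be modelled for a single field roughly as follows. This is a self-contained sketch of the documented rules, not the library's implementation: `SketchQuoteField` and `QuoteMode` are hypothetical names, and returning the field unchanged in `"Raw"` mode (with no escaping of double quotes) is an assumption.

```vba
' Hypothetical helper, not part of VBA-CSV: models the documented quoting
' rules for one field. QuoteMode is True, False or the string "Raw".
Public Function SketchQuoteField(ByVal Field As Variant, ByVal QuoteMode As Variant, _
        Optional ByVal Delimiter As String = ",") As String
    Const DQ As String = """"   ' a single double-quote character
    Dim NeedsQuotes As Boolean

    ' Only strings are ever quoted. This sketch handles numbers and
    ' Booleans via CStr; it does not handle errors or nulls.
    If VarType(Field) <> vbString Then
        SketchQuoteField = CStr(Field)
        Exit Function
    End If

    ' Assumption: "Raw" writes the string exactly as it is.
    If VarType(QuoteMode) = vbString Then
        If QuoteMode = "Raw" Then
            SketchQuoteField = Field
            Exit Function
        End If
    End If

    If CBool(QuoteMode) Then
        ' TRUE: every string is quoted.
        NeedsQuotes = True
    Else
        ' FALSE: quote only strings containing the delimiter,
        ' a line feed, a carriage return or a double quote.
        NeedsQuotes = InStr(Field, Delimiter) > 0 Or InStr(Field, vbLf) > 0 _
            Or InStr(Field, vbCr) > 0 Or InStr(Field, DQ) > 0
    End If

    If NeedsQuotes Then
        ' Embedded double quotes are escaped by doubling them.
        SketchQuoteField = DQ & Replace(Field, DQ, DQ & DQ) & DQ
    Else
        SketchQuoteField = Field
    End If
End Function
```

For example, `SketchQuoteField("a,b", False)` returns `"a,b"` including the quotes, `SketchQuoteField("abc", False)` returns `abc`, and `SketchQuoteField("a,b", "Raw")` returns `a,b` unquoted, which is why a file written in that mode may not parse back as valid CSV.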

# Errors
4 changes: 4 additions & 0 deletions testfiles/test_keep_quotes.csv
@@ -0,0 +1,4 @@
"Col1","Col2","Col3","Col4"
1,"x","x",True
2,"y","y",False
3,"z","z",True
3 changes: 3 additions & 0 deletions vba/VBA-CSV.xlsm/AuditSheetComments.txt
@@ -1,4 +1,7 @@
Version Date Time Author Comment
247 26-Oct-2023 12:18 Philip Swannell Code comments only.
246 26-Oct-2023 11:57 Philip Swannell CSVWrite now supports "Raw" as value for QuoteAllStrings argument.
245 26-Oct-2023 09:49 Philip Swannell Added tests 272 to 274 to test behaviour of new "K" option to ConvertTypes. Updated "docstrings" for CSVRead and CSVWrite. Updated method RegisterCSVRead.
244 23-Oct-2023 16:34 Philip Swannell Rebased tests 178, 207 & 208. Necessary thanks to changes in file https://vincentarelbundock.github.io/Rdatasets/csv/carData/TitanicSurvival.csv. Example changes: top left element was null string is now text "rownames", yes and no previously appeared quoted, now unquoted.
243 23-Oct-2023 16:31 Philip Swannell Re-arranged file locations.
242 06-Mar-2023 16:56 Philip Swannell Deleted worksheets NotRFC4180, Demo, Col-by-col, Sheet1, Notes
6 changes: 5 additions & 1 deletion vba/VBA-CSV.xlsm/Formulas/CSVWriteTests.txt
@@ -1,7 +1,7 @@
Address Formula
F5 =AND(Q8#)
O8 =CSVWrite(INDIRECT(E8),G8,F8,H8,I8,J8,K8,L8,M8,N8)
Q8:Q53 =O8:O53=P8:P53
Q8:Q57 =O8:O57=P8:P57
O9 =CSVWrite(INDIRECT(E9),G9,F9,H9,I9,J9,K9,L9,M9,N9)
O10 =CSVWrite(INDIRECT(E10),G10,F10,H10,I10,J10,K10,L10,M10,N10)
O11 =CSVWrite(INDIRECT(E11),G11,F11,H11,I11,J11,K11,L11,M11,N11)
@@ -55,3 +55,7 @@ O50 =CSVWrite(INDIRECT(E50),G50,F50,H50,I50,J50,K50,L50,M50,N50)
O51 =CSVWrite(INDIRECT(E51),G51,F51,H51,I51,J51,K51,L51,M51,N51)
O52 =CSVWrite(INDIRECT(E52),G52,F52,H52,I52,J52,K52,L52,M52,N52)
O53 =CSVWrite(INDIRECT(E53),G53,F53,H53,I53,J53,K53,L53,M53,N53)
O54 =CSVWrite(INDIRECT(E54),G54,F54,H54,I54,J54,K54,L54,M54,N54)
O55 =CSVWrite(INDIRECT(E55),G55,F55,H55,I55,J55,K55,L55,M55,N55)
O56 =CSVWrite(INDIRECT(E56),G56,F56,H56,I56,J56,K56,L56,M56,N56)
O57 =CSVWrite(INDIRECT(E57),G57,F57,H57,I57,J57,K57,L57,M57,N57)
30 changes: 6 additions & 24 deletions vba/VBA-CSV.xlsm/Formulas/Test.txt
@@ -4,42 +4,24 @@ C3 =IF(OR(LEFT(INDEX(Tests[FileName],RowNo),4)="http",ISNUMBER(SEARCH(",",INDEX(
C5 =MATCH(TestNo,Tests[TestNo],0)
B7 =B13
E7 =E13
E7 =E13
F7 =F13
F7 =F13
G7 =G13
G7 =G13
H7 =H13
H7 =H13
I7 =I13
I7 =I13
J7 =J13
J7 =J13
K7 =K13
K7 =K13
L7 =L13
L7 =L13
M7 =M13
M7 =M13
N7 =N13
N7 =N13
O7 =O13
O7 =O13
P7 =P13
P7 =P13
Q7 =Q13
Q7 =Q13
R7 =R13
R7 =R13
S7 =S13
S7 =S13
T7 =T13
T7 =T13
U7 =U13
U7 =U13
V7 =V13
V7 =V13
W7:W284 =fill(" ",ROWS(Tests[TestNo])+7,1)
W7 =fill(" ",ROWS(Tests[TestNo])+7,1)
AB12 =IF(INDEX(Tests[HeaderRowNum],RowNo)=0,"#Not requested!",CSVRead(FileName,
FALSE,
INDEX(Tests[Delimiter],RowNo),
@@ -61,7 +43,7 @@ INDEX(Tests[DecimalSeparator],RowNo)))
X13 ="VBA code for Test"&TestNo
AB13 =ROWS(CallToCSVRead#)&" x "&COLUMNS(CallToCSVRead#)
B14 =FileSize(Folder&[@FileName])
X14:X40 =GenerateTestCode(TestNo,
X14:X39 =GenerateTestCode(TestNo,
INDEX(Tests[FileName],RowNo),
CallToCSVRead#,
unpack(INDEX(Tests[ConvertTypes],RowNo)),
@@ -82,9 +64,9 @@ INDEX(Tests[Encoding],RowNo),
INDEX(Tests[DecimalSeparator],RowNo),HeaderRow#)
Y14:Y59 =""
Y14:Y59 =""
Z14:Z743 =CSVRead(FileName,FALSE,FALSE,,,,,,,,,,,,,,INDEX(Tests[Encoding],RowNo))
AA14:AA743 =fill(" ",ROWS(Z14#),1)
AB14:AB743 =CSVRead(FileName,
Z14:Z17 =CSVRead(FileName,FALSE,FALSE,,,,,,,,,,,,,,INDEX(Tests[Encoding],RowNo))
AA14:AA17 =fill(" ",ROWS(Z14#),1)
AB14:AE17 =CSVRead(FileName,
unpack(INDEX(Tests[ConvertTypes],RowNo)),
INDEX(Tests[Delimiter],RowNo),
INDEX(Tests[IgnoreRepeated],RowNo),
@@ -102,7 +84,7 @@ INDEX(Tests[MissingStrings],RowNo),
"I'm missing!",
INDEX(Tests[Encoding],RowNo),
INDEX(Tests[DecimalSeparator],RowNo))
AB14:AB743 =CSVRead(FileName,
AB14:AE17 =CSVRead(FileName,
unpack(INDEX(Tests[ConvertTypes],RowNo)),
INDEX(Tests[Delimiter],RowNo),
INDEX(Tests[IgnoreRepeated],RowNo),
Binary file modified vba/VBA-CSV.xlsm/VBA-CSV.xlsm
Binary file not shown.
1 change: 0 additions & 1 deletion vba/VBA-CSV.xlsm/VBA/modCSVPerformanceLowLevel.bas
@@ -340,7 +340,6 @@ End Sub
Private Sub SpeedTest_CastISO8601()

Const N As Long = 5000000
Dim Converted As Boolean
Dim DtOut As Date
Dim Expected As Date
Dim i As Long