Commit

Add encoding tests. (#2)
Add encoding for files without BOM.
Add files sorting.
Add allow comments option.
Add similar/total files sizes.

Co-authored-by: Ivan Ivon <ivan.ivon@zerto.com>
i2van and Ivan Ivon authored May 1, 2021
1 parent 7bc3e4a commit aa121f1
Showing 39 changed files with 520 additions and 164 deletions.
4 changes: 2 additions & 2 deletions Deploy/buildlpx.cmd
@@ -1,12 +1,12 @@
@set version=6.7.0
@set version=6.8.0
@set zip="%ProgramFiles%\7-Zip\7z.exe"
@set output="CsvLINQPadDriver.%version%.lpx6"

del %output%

set releaseDir=..\bin\Release\netcoreapp3.1

%zip% a -tzip -mx=9 "%output%" header.xml %releaseDir%\*.dll %releaseDir%\*Connection.png ..\README.md
%zip% a -tzip -mx=9 "%output%" header.xml %releaseDir%\*.dll %releaseDir%\*Connection.png ..\README.md ..\LICENSE

@echo Package %output% created.
@pause
24 changes: 18 additions & 6 deletions README.md
@@ -17,6 +17,7 @@
* [Usage](#usage)
* [Configuration Options](#configuration-options)
* [General](#general)
* [Files](#files)
* [Format](#format)
* [Memory](#memory)
* [Generation](#generation)
@@ -145,21 +146,28 @@ CSV context can be added to LINQPad 6 same way as any other context.

- Click `Add connection`.
- Select `CSV Context Driver` and click `Next`.
- Enter CSV file names or Drag&Drop (Ctrl adds files) from Explorer. Optionally configure other options.
- Enter CSV file names or Drag&Drop (`Ctrl` adds files) from Explorer. Optionally configure other options.
- Query your data.

## Configuration Options ##

### General ###

- **CSV files** - list of CSV files and directories. Type one file/dir per line or Drag&Drop (Ctrl adds files) from explorer. Supports special wildcards: `*` and `**`.
- **CSV files** - list of CSV files and directories. Type one file/dir per line or Drag&Drop (`Ctrl` adds files) from explorer. Supports special wildcards: `*` and `**`.
- `c:\x\*.csv` - all files in folder `c:\x`.
- `c:\x\**.csv` - all files in folder `c:\x` and all sub-directories.

- No BOM encoding - specifies encoding for files without [BOM](https://en.wikipedia.org/wiki/Byte_order_mark). `UTF-8` is default.
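
One way to read the **No BOM encoding** option above: a byte order mark, when present, decides the encoding, and the configured value is only a fallback. The sketch below illustrates that rule with `StreamReader`. It is an assumption about the behavior, not the driver's reading code (which this diff does not show), and `NoBomEncodingSketch` is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of CsvLINQPadDriver.
static class NoBomEncodingSketch
{
    // Honors a BOM if the file has one; otherwise decodes with the fallback encoding.
    public static string ReadAllText(string path, Encoding noBomFallback)
    {
        using var reader = new StreamReader(path, noBomFallback, detectEncodingFromByteOrderMarks: true);
        return reader.ReadToEnd();
    }

    public static void Main()
    {
        var path = Path.GetTempFileName();
        var latin1 = Encoding.GetEncoding("iso-8859-1");
        try
        {
            // No BOM: the text only decodes correctly with the matching fallback.
            File.WriteAllBytes(path, latin1.GetBytes("Name\ncafé"));
            Console.WriteLine(ReadAllText(path, latin1));

            // UTF-8 BOM present: the BOM takes precedence over the fallback.
            File.WriteAllText(path, "Name\ncafé", new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));
            Console.WriteLine(ReadAllText(path, latin1));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both reads return the same text, because the second file carries a BOM that overrides the ISO-8859-1 fallback.
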
### Files ###

- Order by - specifies files sort order. Affects similar files order.
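
As a rough illustration of what a files sort order could look like, the sketch below orders paths by name or by size. The enum members are hypothetical: the diff references `FilesOrderBy` but does not show its values, so `SketchOrderBy` and `FilesOrderSketch` are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical sort modes; the driver's real FilesOrderBy members are not shown in this diff.
enum SketchOrderBy { None, NameAsc, NameDesc, SizeAsc, SizeDesc }

static class FilesOrderSketch
{
    // Returns the files in the requested order; None keeps the input order.
    public static IEnumerable<string> Order(IEnumerable<string> files, SketchOrderBy orderBy) =>
        orderBy switch
        {
            SketchOrderBy.None     => files,
            SketchOrderBy.NameAsc  => files.OrderBy(f => Path.GetFileName(f), StringComparer.OrdinalIgnoreCase),
            SketchOrderBy.NameDesc => files.OrderByDescending(f => Path.GetFileName(f), StringComparer.OrdinalIgnoreCase),
            SketchOrderBy.SizeAsc  => files.OrderBy(f => new FileInfo(f).Length),
            SketchOrderBy.SizeDesc => files.OrderByDescending(f => new FileInfo(f).Length),
            _ => throw new ArgumentOutOfRangeException(nameof(orderBy))
        };

    public static void Main()
    {
        // Lists CSV files in the current directory, smallest first.
        var files = Directory.EnumerateFiles(".", "*.csv");
        foreach (var file in Order(files, SketchOrderBy.SizeAsc))
            Console.WriteLine(file);
    }
}
```

Since similar files are queried as one table, the chosen order would determine which file's rows come first.
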

### Format ###

- CSV separator - character used to separate columns in files. Can be `,`, `\t`, etc. Auto-detected if empty.
- Ignore files with invalid format - files with strange content not similar to CSV format will be ignored.
- Allow comments - lines starting with `#` will be ignored.
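
The effect of **Allow comments** can be sketched as a line filter that drops lines starting with `#` before parsing. This is a simplified illustration, not the driver's implementation: `AllowCommentsSketch` is a hypothetical name, and a real CSV parser would also have to handle `#` inside quoted multi-line fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only; not part of CsvLINQPadDriver.
static class AllowCommentsSketch
{
    // Yields every line, skipping those that start with the comment character
    // when allowComments is true.
    public static IEnumerable<string> ReadDataLines(TextReader reader, bool allowComments, char commentChar = '#')
    {
        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            if (allowComments && line.Length > 0 && line[0] == commentChar)
                continue;

            yield return line;
        }
    }

    public static void Main()
    {
        const string csv = "# comment line\nId,Name\n1,Ann\n# another comment\n2,Bob";

        // Prints the header and the two data rows; both comment lines are skipped.
        foreach (var line in ReadDataLines(new StringReader(csv), allowComments: true))
            Console.WriteLine(line);
    }
}
```

With `allowComments: false` the same input yields all five lines, so a comment line would be parsed as data.
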

### Memory ###

@@ -170,7 +178,7 @@ CSV context can be added to LINQPad 6 same way as any other context.

### Generation ###

- Generate single class for similar files - single class will be generated for similar files which allows to query them as single one. Might not work well for files with relations.
- Generate single class for similar files - single class will be generated for similar files which allows to query them as a single one. Might not work well for files with relations.
- String comparison - string comparison for `Equals` and `GetHashCode` methods.

### Relations ###
@@ -261,7 +269,7 @@ string this[int index] { get; set; }
string this[string index] { get; set; }
```

See below.
See [properties access](#properties-access) below.

> Relations are not participated.

@@ -299,7 +307,7 @@ Authors.First()

### Extension Methods ###

Driver provides extension methods for converting string to nullable types:
Driver provides extension methods for converting string to `T?`. `CultureInfo.InvariantCulture` is used by default.

```csharp
// Bool.
@@ -311,6 +319,9 @@ int? ToInt(CultureInfo? cultureInfo = null);
// Long.
long? ToLong(CultureInfo? cultureInfo = null);

// Float.
float? ToFloat(CultureInfo? cultureInfo = null);

// Double.
double? ToDouble(CultureInfo? cultureInfo = null);

@@ -353,8 +364,9 @@ TimeSpan? ToTimeSpan(

## Known Issues ##

- Default encoding for files without BOM is UTF-8.
- Some strange Unicode characters in column names may cause errors in generated data context source code.
- Writing changed objects back to CSV is not directly supported, there is no `SubmitChanges()` . But you can use LINQPad's `Util.WriteCsv`.
- Writing changed objects back to CSV is not directly supported, there is no `SubmitChanges()`. But you can use LINQPad's `Util.WriteCsv`.
- Similar files single class generation does not detect relations correctly. However, you can query over related multiple files.

## Authors ##
23 changes: 14 additions & 9 deletions Src/CsvLINQPadDriver/CodeGen/CsvCSharpCodeGenerator.cs
@@ -33,24 +33,27 @@ private CsvCSharpCodeGenerator(string contextNameSpace, string contextTypeName,
_properties = properties;
}

public record TypeCodeResult(string TypeName, string Code, string CodeName, string FilePath);
public record Result(string Code, IReadOnlyCollection<IGrouping<string, TypeCodeResult>> CodeGroups);

// ReSharper disable once RedundantAssignment
public static (string Code, IReadOnlyCollection<IGrouping<string, (string Type, string Code, string CodeName)>> CodeGroups)
GenerateCode(CsvDatabase db, ref string nameSpace, ref string typeName, ICsvDataContextDriverProperties props) =>
public static Result GenerateCode(CsvDatabase db, ref string nameSpace, ref string typeName, ICsvDataContextDriverProperties props) =>
new CsvCSharpCodeGenerator(nameSpace, typeName = DefaultContextTypeName, props).GenerateSrcFile(db);

private (string, IReadOnlyCollection<IGrouping<string, (string Type, string Code, string CodeName)>>) GenerateSrcFile(CsvDatabase csvDatabase)
private Result GenerateSrcFile(CsvDatabase csvDatabase)
{
var csvTables = csvDatabase.Tables;

var groups = csvTables
.Select(table => GenerateTableRowDataTypeClass(table, _properties.HideRelationsFromDump, _properties.StringComparison))
.GroupBy(typeCode => typeCode.Type)
.GroupBy(typeCode => typeCode.TypeName)
.ToImmutableList();

return ($@"using System;
using System.Linq;
return new Result($@"using System;
using System.Collections.Generic;
using CsvLINQPadDriver;
namespace {_contextNameSpace}
{{
/// <summary>CSV Data Context</summary>
@@ -67,6 +70,8 @@ public class {_contextTypeName} : {typeof(CsvDataContextBase).GetCodeTypeClassNa
{GetBoolConst(_properties.IsStringInternEnabled)},
{GetBoolConst(_properties.IsCacheEnabled)},
{table.CsvSeparator.AsValidCSharpCode()},
{nameof(NoBomEncoding)}.{_properties.NoBomEncoding},
{GetBoolConst(_properties.AllowComments)},
{table.FilePath.AsValidCSharpCode()},
new {typeof(CsvColumnInfoList<>).GetCodeTypeClassName(GetClassName(table))} {{
{string.Join(string.Empty, table.Columns.Select(c => $@"{{ {c.Index}, x => x.{c.CodeName} }}, "))}
@@ -88,12 +93,12 @@ static string GetBoolConst(bool val) =>
val ? "true" : "false";
}

private static (string Type, string Code, string CodeName) GenerateTableRowDataTypeClass(CsvTable table, bool hideRelationsFromDump, StringComparison stringComparison)
private static TypeCodeResult GenerateTableRowDataTypeClass(CsvTable table, bool hideRelationsFromDump, StringComparison stringComparison)
{
var className = GetClassName(table);
var properties = table.Columns.Select(GetPropertyName).ToImmutableList();

return (className, $@"
return new TypeCodeResult(className, $@"
public sealed record {className} : {typeof(ICsvRowBase).GetCodeTypeClassName()}
{{{string.Join(string.Empty, table.Columns.Select(csvColumn => $@"
public string {GetPropertyName(csvColumn)} {{ get; set; }}"))}
@@ -106,7 +111,7 @@ public sealed record {className} : {typeof(ICsvRowBase).GetCodeTypeClassName()}
[{typeof(HideFromDumpAttribute).GetCodeTypeClassName()}]" : string.Empty)}
public IEnumerable<{csvRelation.TargetTable.GetCodeRowClassName()}> {csvRelation.CodeName} {{ get; set; }}")
)}
}}", table.CodeName!);
}}", table.CodeName!, table.FilePath);

static string GetPropertyName(ICsvNames csvColumn) =>
csvColumn.CodeName!;
15 changes: 10 additions & 5 deletions Src/CsvLINQPadDriver/CodeGen/CsvTableBase.cs
@@ -21,21 +21,26 @@ public abstract class CsvTableBase<TRow> : CsvTableBase, IEnumerable<TRow>
{
private static CsvRowMappingBase<TRow>? _cachedCsvRowMappingBase;

private char CsvSeparator { get; }
private readonly char _csvSeparator;
private readonly NoBomEncoding _noBomEncoding;
private readonly bool _allowComments;

protected string FilePath { get; }
protected readonly string FilePath;

protected CsvTableBase(bool isStringInternEnabled, char csvSeparator, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
protected CsvTableBase(bool isStringInternEnabled, char csvSeparator, NoBomEncoding noBomEncoding, bool allowComments, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
: base(isStringInternEnabled)
{
CsvSeparator = csvSeparator;
_csvSeparator = csvSeparator;
_noBomEncoding = noBomEncoding;
_allowComments = allowComments;

FilePath = filePath;

_cachedCsvRowMappingBase ??= new CsvRowMappingBase<TRow>(propertiesInfo, relationsInit);
}

protected IEnumerable<TRow> ReadData() =>
FileUtils.CsvReadRows(FilePath, CsvSeparator, IsStringInternEnabled, _cachedCsvRowMappingBase!);
FileUtils.CsvReadRows(FilePath, _csvSeparator, IsStringInternEnabled, _noBomEncoding, _allowComments, _cachedCsvRowMappingBase!);

// ReSharper disable once UnusedMember.Global
public abstract IEnumerable<TRow> WhereIndexed(Func<TRow, string> getProperty, string propertyName, params string[] values);
4 changes: 2 additions & 2 deletions Src/CsvLINQPadDriver/CodeGen/CsvTableEnumerable.cs
@@ -7,8 +7,8 @@ namespace CsvLINQPadDriver.CodeGen
internal class CsvTableEnumerable<TRow> : CsvTableBase<TRow>
where TRow : ICsvRowBase, new()
{
public CsvTableEnumerable(bool isStringInternEnabled, char csvSeparator, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
: base(isStringInternEnabled, csvSeparator, filePath, propertiesInfo, relationsInit)
public CsvTableEnumerable(bool isStringInternEnabled, char csvSeparator, NoBomEncoding noBomEncoding, bool allowComments, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
: base(isStringInternEnabled, csvSeparator, noBomEncoding, allowComments, filePath, propertiesInfo, relationsInit)
{
}

7 changes: 5 additions & 2 deletions Src/CsvLINQPadDriver/CodeGen/CsvTableFactory.cs
@@ -10,12 +10,15 @@ public static CsvTableBase<TRow> CreateTable<TRow>(
bool isStringInternEnabled,
bool isCacheEnabled,
char csvSeparator,
NoBomEncoding noBomEncoding,
bool allowComments,
string filePath,
IEnumerable<CsvColumnInfo> propertiesInfo,
Action<TRow> relationsInit)
where TRow : ICsvRowBase, new() =>
isCacheEnabled
? (CsvTableBase<TRow>)new CsvTableList<TRow>(isStringInternEnabled, csvSeparator, filePath, propertiesInfo, relationsInit)
: new CsvTableEnumerable<TRow>(isStringInternEnabled, csvSeparator, filePath, propertiesInfo, relationsInit);
// ReSharper disable once RedundantCast
? (CsvTableBase<TRow>)new CsvTableList<TRow>(isStringInternEnabled, csvSeparator, noBomEncoding, allowComments, filePath, propertiesInfo, relationsInit)
: new CsvTableEnumerable<TRow>(isStringInternEnabled, csvSeparator, noBomEncoding, allowComments, filePath, propertiesInfo, relationsInit);
}
}
6 changes: 3 additions & 3 deletions Src/CsvLINQPadDriver/CodeGen/CsvTableList.cs
@@ -12,8 +12,8 @@ internal class CsvTableList<TRow> : CsvTableBase<TRow>, IList<TRow>
private readonly IDictionary<string, ILookup<string, TRow>> _indices = new Dictionary<string, ILookup<string, TRow>>();
private readonly Lazy<IList<TRow>> _dataCache;

public CsvTableList(bool isStringInternEnabled, char csvSeparator, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
: base(isStringInternEnabled, csvSeparator, filePath, propertiesInfo, relationsInit) =>
public CsvTableList(bool isStringInternEnabled, char csvSeparator, NoBomEncoding noBomEncoding, bool allowComments, string filePath, IEnumerable<CsvColumnInfo> propertiesInfo, Action<TRow> relationsInit)
: base(isStringInternEnabled, csvSeparator, noBomEncoding, allowComments, filePath, propertiesInfo, relationsInit) =>
_dataCache = new Lazy<IList<TRow>>(() => ReadData().Cache($"{typeof(TRow).Name}:{FilePath}"));

private IList<TRow> DataCache =>
@@ -30,7 +30,7 @@ public override IEnumerable<TRow> WhereIndexed(Func<TRow, string> getProperty, s
_indices.Add(propertyName, propertyIndex);
}

var result = values.SelectMany(value => propertyIndex[value]);
var result = values.SelectMany(value => propertyIndex![value]);

return values.Length > 1 ? result.Distinct() : result;
}
54 changes: 46 additions & 8 deletions Src/CsvLINQPadDriver/ConnectionDialog.xaml
@@ -2,22 +2,28 @@
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:system="clr-namespace:System;assembly=System.Runtime"
xmlns:wpf="clr-namespace:CsvLINQPadDriver.Wpf"
Title="CSV Files Connection"
Icon="Connection.ico"
Background="{x:Static SystemColors.ControlBrush}"
Width="640"
Height="520"
Width="680"
Height="580"
WindowStartupLocation="CenterScreen"
FocusManager.FocusedElement="{Binding ElementName=FilesTextBox}"
Loaded="ConnectionDialog_OnLoaded">

<Window.Resources>
<ObjectDataProvider x:Key="NoBomEncodingData" MethodName="GetValues" ObjectType="{x:Type wpf:NoBomEncodingEnumObjectDataSource}" />
<ObjectDataProvider x:Key="FilesOrderByData" MethodName="GetValues" ObjectType="{x:Type wpf:FilesOrderByEnumObjectDataSource}" />
</Window.Resources>

<Window.CommandBindings>
<CommandBinding
Command="ApplicationCommands.Paste"
Executed="PasteAndGoCommandBinding_OnExecuted"
CanExecute="PasteAndGoCommandBinding_OnCanExecute"/>
<CommandBinding
Command="ApplicationCommands.Help"
Command="ApplicationCommands.Help"
Executed="Help_OnExecuted"
CanExecute="Help_OnCanExecute"/>
</Window.CommandBindings>
@@ -34,13 +40,30 @@
</Grid.RowDefinitions>

<DockPanel Grid.Row="0">
<Label Padding="0,0,0,3" Focusable="False" Content="CSV _files. Drag&amp;Drop (Ctrl adds files) or type one file/directory per line" DockPanel.Dock="Top" Target="{Binding ElementName=FilesTextBox}" />
<DockPanel DockPanel.Dock="Top" LastChildFill="True" VerticalAlignment="Center" KeyboardNavigation.TabNavigation="Local">
<Label Padding="0,3,0,3" Content="CSV _files. Drag&amp;Drop (Ctrl adds files) or type one file/directory per line." DockPanel.Dock="Left" Target="{Binding ElementName=FilesTextBox}" />
<Label Padding="0,3,0,3" Content=" No _BOM encoding" DockPanel.Dock="Left" Target="{Binding ElementName=NoBomEncodingComboBox}" />
<TextBlock DockPanel.Dock="Right" Padding="6 3 3 0" KeyboardNavigation.TabIndex="2">
<Hyperlink TextDecorations="" NavigateUri="https://en.wikipedia.org/wiki/Byte_order_mark" Command="ApplicationCommands.Help">
<TextBlock Text="?" ToolTip="BOM on Wikipedia" />
</Hyperlink>
</TextBlock>
<ComboBox
Name="NoBomEncodingComboBox"
Margin="3,0,0,3"
ToolTip="Encoding for files without BOM"
SelectedValue="{Binding NoBomEncoding}"
SelectedValuePath="Item1"
DisplayMemberPath="Item2"
ItemsSource="{Binding Source={StaticResource NoBomEncodingData}}"
KeyboardNavigation.TabIndex="1"/>
</DockPanel>
<TextBox
Name="FilesTextBox"
AcceptsReturn="True"
AcceptsReturn="True"
HorizontalScrollBarVisibility="Auto"
VerticalScrollBarVisibility="Auto"
Text="{Binding Files, UpdateSourceTrigger=PropertyChanged}"
Text="{Binding Files, UpdateSourceTrigger=PropertyChanged}"
ToolTip="CSV files. Drag&amp;Drop (Ctrl adds files) or type one file per line. Supports mask '*.csv' or recursive '**.csv'"
AllowDrop="True"
PreviewDragEnter="FilesTextBox_DragEnter"
@@ -49,13 +72,28 @@
</DockPanel>

<StackPanel Grid.Row="1">
<GroupBox Header="Files">
<StackPanel Margin="2" Height="Auto">
<DockPanel VerticalAlignment="Center">
<Label Padding="0 3 4 0" Content="_Order by"/>
<ComboBox
ToolTip="Files sort order. Affects similar files order"
SelectedValue="{Binding FilesOrderBy}"
SelectedValuePath="Item1"
DisplayMemberPath="Item2"
ItemsSource="{Binding Source={StaticResource FilesOrderByData}}"
/>
</DockPanel>
</StackPanel>
</GroupBox>
<GroupBox Header="Format">
<StackPanel Margin="2" Height="Auto">
<DockPanel VerticalAlignment="Center">
<Label Padding="0 0 4 0" Content="CSV _separator (\t for tab). Auto-detected if empty"/>
<TextBox MaxLength="6" MaxLines="1" Text="{Binding CsvSeparator}" ToolTip="Character used to separate columns in CSV file. Separator is auto-detected for each file if empty"/>
</DockPanel>
<CheckBox IsChecked="{Binding IgnoreInvalidFiles}" Content="Ignore files _with invalid format" ToolTip="Ignore files with invalid format"/>
<CheckBox IsChecked="{Binding AllowComments}" Content="_Allow comments" ToolTip="Ignore lines starting with #"/>
</StackPanel>
</GroupBox>
<GroupBox Header="Memory">
@@ -73,8 +111,8 @@
<!-- ReSharper disable once MarkupAttributeTypo -->
<Label Padding="0 3 4 0" Content="S_tring comparison" DockPanel.Dock="Left" Target="{Binding ElementName=StringComparisonComboBox}"/>
<TextBlock DockPanel.Dock="Right" Padding="5 3 0 0" KeyboardNavigation.TabIndex="2">
<Hyperlink TextDecorations="" NavigateUri="https://docs.microsoft.com/en-us/dotnet/api/system.stringcomparison" Command="ApplicationCommands.Help">
<TextBlock Text="?"/>
<Hyperlink TextDecorations="" NavigateUri="https://docs.microsoft.com/en-us/dotnet/api/system.stringcomparison#fields" Command="ApplicationCommands.Help">
<TextBlock Text="?" ToolTip="StringComparison on Microsoft Docs" />
</Hyperlink>
</TextBlock>
<ComboBox Name="StringComparisonComboBox"
6 changes: 4 additions & 2 deletions Src/CsvLINQPadDriver/ConnectionDialog.xaml.cs
@@ -81,8 +81,10 @@ private void PasteAndGoCommandBinding_OnCanExecute(object sender, CanExecuteRout
e.CanExecute = Clipboard.ContainsText(TextDataFormat.Text) ||
Clipboard.ContainsText(TextDataFormat.UnicodeText);

private void Help_OnExecuted(object sender, ExecutedRoutedEventArgs e) =>
Process.Start(new ProcessStartInfo(((Hyperlink)e.OriginalSource).NavigateUri.OriginalString){ UseShellExecute = true });
private void Help_OnExecuted(object sender, ExecutedRoutedEventArgs e)
{
using var process = Process.Start(new ProcessStartInfo(((Hyperlink) e.OriginalSource).NavigateUri.OriginalString) { UseShellExecute = true });
}

private void Help_OnCanExecute(object sender, CanExecuteRoutedEventArgs e) =>
e.CanExecute = true;