TSDSS is a Python package for time series analysis and surrogate data generation. It provides tools for statistical analysis, preprocessing, feature extraction, filtering, and surrogate generation for both univariate and multivariate time series.
- Basic statistics (mean, std, skewness, kurtosis)
- Stationarity tests (ADF test, Ljung-Box test)
- Correlation analysis (Pearson, Spearman, Kendall)
- Spectral analysis
- Nonlinear analysis (Lyapunov exponent, phase space reconstruction)
- Entropy measures
- Missing value interpolation
- Outlier detection
- Normalization
- Resampling
- Feature extraction
- IAAFT (Iterative Amplitude Adjusted Fourier Transform)
- IAAFT+ (Enhanced IAAFT)
- IPFT (Iterative Phase-adjusted Fourier Transform)
- AIAAFT (Adaptive IAAFT)
- IAAWT (Iterative Amplitude Adjusted Wavelet Transform)
- Multivariate surrogate methods
- Bootstrap methods
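To make the IAAFT idea concrete, here is a minimal NumPy sketch of the textbook algorithm. It is not the TSDSS implementation: the function name `iaaft_sketch` is hypothetical, and details such as the stopping rule are simplified to a fixed iteration count. The loop alternates between imposing the original amplitude spectrum and imposing the original value distribution.

```python
import numpy as np

def iaaft_sketch(x, n_iterations=100, seed=None):
    """Illustrative IAAFT surrogate (hypothetical helper, not the TSDSS code)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    target_amplitudes = np.abs(np.fft.rfft(x))  # amplitude spectrum to preserve
    sorted_values = np.sort(x)                  # value distribution to preserve
    surrogate = rng.permutation(x)              # start from a random shuffle
    for _ in range(n_iterations):
        # Step 1: keep the current phases, impose the original amplitudes.
        phases = np.angle(np.fft.rfft(surrogate))
        spectrum_matched = np.fft.irfft(
            target_amplitudes * np.exp(1j * phases), n=len(x)
        )
        # Step 2: impose the original values by rank ordering.
        ranks = np.argsort(np.argsort(spectrum_matched))
        surrogate = sorted_values[ranks]
    return surrogate

rng = np.random.default_rng(0)
signal = np.cumsum(rng.normal(size=512))  # a correlated test series
surrogate = iaaft_sketch(signal, seed=1)
```

Because the loop ends on the rank-ordering step, the surrogate has exactly the same values as the input, while its spectrum matches only approximately.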
Each filter has its own characteristics and use cases:
- Moving Average Filter: Simple and effective for reducing random noise
- Exponential Filter: Gives more weight to recent data points
- Savitzky-Golay Filter: Preserves higher moments of the data while smoothing
- Kalman Filter: Tracks time-varying signals; optimal when the underlying model is linear with Gaussian noise
- Butterworth Filter: Frequency-domain filtering with a maximally flat passband response
- Median Filter: Excellent for removing impulse noise and outliers
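The difference between these filters is easiest to see on a single impulse. The sketch below does not use the TSDSS filter functions; it relies only on NumPy and `scipy.signal.medfilt`, and `exponential_smooth` is a hypothetical helper written for illustration.

```python
import numpy as np
from scipy.signal import medfilt

# A flat signal with one impulse at index 10.
signal = np.ones(21)
signal[10] = 100.0

window = 5
# Moving average: the impulse is spread over the whole window.
moving_avg = np.convolve(signal, np.ones(window) / window, mode="same")
# Median filter: an isolated impulse is removed entirely.
median = medfilt(signal, kernel_size=window)

def exponential_smooth(x, alpha):
    """Hypothetical helper: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    out = np.empty(len(x))
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

# Exponential filter: the impulse decays gradually after it occurs.
ema = exponential_smooth(signal, alpha=0.3)

print(moving_avg[10], median[10], ema[10])
```

The moving average and the exponential filter both leak the impulse into neighbouring samples, whereas the median filter returns the flat baseline, which is why it is the usual choice for impulse noise.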
For multivariate time series, the multivariate_filter function provides a unified interface to apply any of these filters to each dimension of the data. Key features:
- Supports all single-variable filtering methods
- Maintains correlations between dimensions
- Handles errors gracefully for each dimension
- Preserves the original data structure
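As a rough illustration of the per-dimension pattern described above, the following sketch applies a univariate filter to each column and records failures instead of raising. It is an assumption about how such a wrapper could look, not the actual multivariate_filter code; `filter_each_column` and `moving_average` are hypothetical names, and falling back to the unfiltered column on error is a choice made for this example.

```python
import numpy as np

def filter_each_column(data, filter_func, **kwargs):
    """Hypothetical wrapper: apply a 1D filter to every column of a 2D array."""
    data = np.asarray(data, dtype=float)
    filtered = np.empty_like(data)
    errors = {}  # column index -> error message
    for j in range(data.shape[1]):
        try:
            filtered[:, j] = filter_func(data[:, j], **kwargs)
        except Exception as exc:
            # Assumed fallback: keep the original column and note the failure.
            errors[j] = str(exc)
            filtered[:, j] = data[:, j]
    return filtered, errors

def moving_average(x, window_size=5):
    """Simple centered moving average used as the example filter."""
    return np.convolve(x, np.ones(window_size) / window_size, mode="same")

def always_fails(x):
    """Filter that raises, to exercise the error path."""
    raise ValueError("filter failed")

rng = np.random.default_rng(0)
mv_data = rng.normal(size=(200, 3))
smoothed, smooth_errors = filter_each_column(mv_data, moving_average, window_size=5)
unchanged, fail_errors = filter_each_column(mv_data, always_fails)
```

The output always has the same shape as the input, so downstream code can treat filtered and unfiltered data interchangeably.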
pip install tsdss
TSDSS accepts the following input formats:
- NumPy arrays (1D for univariate, 2D for multivariate)
- Pandas Series (for univariate)
- Pandas DataFrame (for multivariate)
Example shapes:
- Univariate: (n_samples,) or (n_samples, 1)
- Multivariate: (n_samples, n_features)
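If you want to normalize these input types yourself before calling into the package, a small helper along the following lines can be used. This is a sketch, not part of TSDSS: `to_2d_array` is a hypothetical name, and it simply relies on `np.asarray` accepting arrays, Series, and DataFrames.

```python
import numpy as np
import pandas as pd

def to_2d_array(ts):
    """Hypothetical helper: coerce supported inputs to shape (n_samples, n_features)."""
    values = np.asarray(ts, dtype=float)  # works for ndarray, Series, DataFrame
    if values.ndim == 1:
        values = values.reshape(-1, 1)    # univariate: add a feature axis
    if values.ndim != 2:
        raise ValueError(f"expected 1D or 2D input, got {values.ndim}D")
    return values

univariate = to_2d_array(pd.Series([1.0, 2.0, 3.0]))
multivariate = to_2d_array(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
already_2d = to_2d_array(np.zeros((10, 4)))
print(univariate.shape, multivariate.shape, already_2d.shape)
```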
import numpy as np
import pandas as pd
from tsdss import ts_statistics, plot_decomposition, calculate_entropy
# Basic time series statistics
ts = np.random.normal(0, 1, 1000)
stats = ts_statistics(ts)
print(stats)
# Plot time series decomposition
plot_decomposition(ts)
# Calculate entropy
entropy = calculate_entropy(ts)
print(f"Entropy: {entropy}")
from tsdss import interpolate_missing, detect_outliers, normalize_ts, resample_ts
# Handle missing values
ts = pd.Series([1, np.nan, 3, np.nan, 5])
ts_clean = interpolate_missing(ts, method='linear') # Options: linear, ffill, bfill, cubic, spline
# Detect outliers
ts = np.random.normal(0, 1, 1000)
outliers = detect_outliers(ts, method='zscore', threshold=3) # Options: zscore, iqr, mad
# Normalize data
ts_norm = normalize_ts(ts, method='zscore') # Options: zscore, minmax, robust
# Resample time series (requires datetime index)
dates = pd.date_range('2023-01-01', periods=100, freq='D')
ts = pd.Series(np.random.randn(100), index=dates)
ts_resampled = resample_ts(ts, freq='W', method='mean')
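For reference, the three outlier rules named above (zscore, iqr, mad) are commonly defined as in the sketch below. This is an illustration in plain NumPy, not the detect_outliers implementation: `outlier_mask` is a hypothetical function, and how TSDSS interprets `threshold` for each method is an assumption here.

```python
import numpy as np

def outlier_mask(x, method="zscore", threshold=3.0):
    """Hypothetical helper: boolean mask marking outliers under a common rule."""
    x = np.asarray(x, dtype=float)
    if method == "zscore":
        # Distance from the mean in units of standard deviation.
        return np.abs(x - x.mean()) / x.std() > threshold
    if method == "iqr":
        # Tukey fences; threshold is the IQR multiplier (1.5 is conventional).
        q1, q3 = np.percentile(x, [25, 75])
        iqr = q3 - q1
        return (x < q1 - threshold * iqr) | (x > q3 + threshold * iqr)
    if method == "mad":
        # Modified z-score based on the median absolute deviation.
        med = np.median(x)
        mad = np.median(np.abs(x - med))
        return 0.6745 * np.abs(x - med) / mad > threshold
    raise ValueError(f"unknown method: {method}")

rng = np.random.default_rng(0)
values = rng.normal(size=500)
values[100] = 15.0  # inject one obvious outlier

mask_z = outlier_mask(values, "zscore", threshold=3.0)
mask_iqr = outlier_mask(values, "iqr", threshold=1.5)
mask_mad = outlier_mask(values, "mad", threshold=3.5)
```

The z-score rule is itself affected by outliers because they inflate the mean and standard deviation; the IQR and MAD rules are more robust on heavily contaminated data.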
from tsdss import extract_time_features, extract_freq_features
# Extract time domain features
ts = np.random.normal(0, 1, 1000)
time_features = extract_time_features(ts)
print("Time domain features:", time_features)
# Extract frequency domain features
freq_features = extract_freq_features(ts)
print("Frequency domain features:", freq_features)
from tsdss import mutual_information, kendall_correlation
# Calculate mutual information
x = np.random.normal(0, 1, 1000)
y = 0.5 * x + np.random.normal(0, 1, 1000)
mi = mutual_information(x, y)
print(f"Mutual Information: {mi}")
# Calculate Kendall correlation
kendall = kendall_correlation(x, y)
print(f"Kendall Correlation: {kendall}")
from tsdss import (
iaaft, iaaft_plus, ipft, aiaaft,
multivariate_iaaft, block_bootstrap,
stationary_bootstrap
)
# Generate univariate surrogate data
ts = np.random.normal(0, 1, 1000)
# IAAFT method
surrogate_iaaft = iaaft(ts, n_iterations=1000, num_surrogates=1)[0]
# IAAFT+ method
surrogate_iaaft_plus = iaaft_plus(ts, n_iterations=1000, num_surrogates=1)[0]
# IPFT method
surrogate_ipft = ipft(ts, n_iterations=1000, num_surrogates=1)[0]
# Generate multivariate surrogate data
data = np.random.normal(0, 1, (1000, 3)) # 3-dimensional time series
mv_surrogate = multivariate_iaaft(data, max_iter=100, num_surrogates=1)[0]
# Bootstrap methods
block_samples = block_bootstrap(ts, block_length=50, num_bootstrap=100)
stat_samples = stationary_bootstrap(ts, mean_block_length=50, num_bootstrap=100)
from tsdss import dwt, idwt, iaawt
# Perform discrete wavelet transform
ts = np.random.normal(0, 1, 1024) # Length should be a power of 2
coeffs = dwt(ts, level=3)
# Perform inverse wavelet transform
reconstructed = idwt(coeffs)
# Generate wavelet-based surrogate
surrogate = iaawt(ts, n_iterations=1000, num_surrogates=1)[0]
from tsdss import (
mvts_surrogate_s_transform,
mvts_surrogate_wavelet,
mvts_surrogate_pca,
copula_surrogate
)
# Generate multivariate data
data = np.random.normal(0, 1, (1000, 5))
# Different multivariate surrogate methods
surrogate_st = mvts_surrogate_s_transform(data, num_surrogates=1)[0]
surrogate_wavelet = mvts_surrogate_wavelet(data, num_surrogates=1)[0]
surrogate_pca = mvts_surrogate_pca(data, num_surrogates=1)[0]
surrogate_copula = copula_surrogate(data, num_surrogates=1)[0]
from tsdss import block_bootstrap, stationary_bootstrap
# 1. Block Bootstrap
# Fixed block length, suitable for data with strong local dependencies
ts = np.random.normal(0, 1, 1000)
block_samples = block_bootstrap(
data=ts,
block_length=50, # Fixed block length
num_bootstrap=100
)
# 2. Stationary Bootstrap
# Random block length (geometric distribution), preserves stationarity
stat_samples = stationary_bootstrap(
data=ts,
mean_block_length=50, # Average block length
num_bootstrap=100
)
# Compare the two methods
print("Block Bootstrap first sample:", block_samples[0][:10])
print("Stationary Bootstrap first sample:", stat_samples[0][:10])
# Using with pandas Series
ts_series = pd.Series(ts)
block_samples_pd = block_bootstrap(ts_series, block_length=50, num_bootstrap=100)
stat_samples_pd = stationary_bootstrap(ts_series, mean_block_length=50, num_bootstrap=100)
# Key differences:
# 1. Block Bootstrap: Uses fixed block length
# 2. Stationary Bootstrap: Uses random block length (geometric distribution)
# - Better preserves stationarity
# - More suitable for time series with varying dependence structures
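The difference between fixed and geometric block lengths can be sketched in a few lines of NumPy. The functions below are illustrative only (the `_sketch` names are hypothetical and this is not the TSDSS code); they return a single resample each, and the block version uses overlapping (moving) blocks.

```python
import numpy as np

def block_bootstrap_sketch(x, block_length, seed=None):
    """Moving block bootstrap: concatenate randomly chosen fixed-length blocks."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    n_blocks = -(-n // block_length)  # ceiling division
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    indices = (starts[:, None] + np.arange(block_length)).ravel()[:n]
    return x[indices]

def stationary_bootstrap_sketch(x, mean_block_length, seed=None):
    """Stationary bootstrap: block lengths are geometric with the given mean."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    p = 1.0 / mean_block_length  # probability of starting a new block
    indices = np.empty(n, dtype=int)
    indices[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            indices[t] = rng.integers(n)            # jump: new block starts
        else:
            indices[t] = (indices[t - 1] + 1) % n   # continue block, wrap around
    return x[indices]

x = np.arange(1000.0)  # index-like data makes the block structure visible
block_sample = block_bootstrap_sketch(x, block_length=50, seed=0)
stat_sample = stationary_bootstrap_sketch(x, mean_block_length=50, seed=0)
```

With `x = np.arange(...)` as input, consecutive values inside a block differ by exactly 1, so block boundaries show up as jumps in the resampled series.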
from tsdss import (
moving_average_filter,
exponential_filter,
savitzky_golay_filter,
kalman_filter,
butterworth_filter,
median_filter,
multivariate_filter
)
import numpy as np
import matplotlib.pyplot as plt
# 1. Univariate Filtering Example
t = np.linspace(0, 10, 1000)
noisy_signal = np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000)
# Apply different filters
ma_filtered = moving_average_filter(noisy_signal, window_size=5)
ema_filtered = exponential_filter(noisy_signal, alpha=0.3)
sg_filtered = savitzky_golay_filter(noisy_signal, window_size=15, poly_order=3)
kalman_filtered = kalman_filter(noisy_signal, Q=1e-5, R=1e-2)
# 2. Multivariate Filtering Example
# Generate sample multivariate data
mv_data = np.column_stack([
np.sin(2*np.pi*0.5*t) + 0.5*np.random.normal(0, 1, 1000),
np.cos(2*np.pi*0.3*t) + 0.3*np.random.normal(0, 1, 1000),
0.5*t + np.random.normal(0, 0.2, 1000)
])
# Apply multivariate filter
mv_filtered = multivariate_filter(
mv_data,
filter_type='kalman',
Q=1e-5,
R=1e-2
)
# You can also try different filter types
mv_ma = multivariate_filter(mv_data, filter_type='ma', window_size=5)
mv_butter = multivariate_filter(
mv_data,
filter_type='butter',
cutoff=0.1,
fs=100
)
The package uses optimized C++ implementations for core computations:
- Trend decomposition
- Skewness and kurtosis calculation
- ACF computation
- Ljung-Box test
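For readers who want to check results against a plain reference, the ACF and Ljung-Box statistic can be written in a few lines of NumPy and SciPy. This sketch follows the standard definitions and is not the package's C++ code; the function names `acf` and `ljung_box` are chosen for this example.

```python
import numpy as np
from scipy.stats import chi2

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

def ljung_box(x, max_lag):
    """Ljung-Box Q = n(n+2) * sum_k rho_k^2 / (n - k), with chi-square p-value."""
    n = len(x)
    rho = acf(x, max_lag)[1:]
    lags = np.arange(1, max_lag + 1)
    q = n * (n + 2) * np.sum(rho**2 / (n - lags))
    p_value = chi2.sf(q, df=max_lag)
    return q, p_value

rng = np.random.default_rng(0)
white_noise = rng.normal(size=500)
random_walk = np.cumsum(white_noise)  # strongly autocorrelated

q_walk, p_walk = ljung_box(random_walk, max_lag=10)
print(f"random walk: Q={q_walk:.1f}, p={p_walk:.3g}")
```

A small p-value rejects the hypothesis that the series is uncorrelated up to `max_lag`, which is the expected outcome for the random walk.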
- Python >= 3.7
- NumPy >= 1.19.0
- Pandas >= 1.0.0
- SciPy >= 1.6.0
- Statsmodels >= 0.13.0
- Scikit-learn >= 0.24.0
- Matplotlib >= 3.0.0
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Added comprehensive time series filtering functionality
- Added multivariate filtering support
- Improved documentation and examples
- Bug fixes and performance improvements
- Initial release