forked from SteveGreaves/AstroBinUploader
AstroBinUpload.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Release history
--------------------------------------------------------------------------------------------
Version 1.0.4
6th December 2023 changes
1. No longer required to manually create configuration csv files.
   Checks if csv files exist and creates them if they do not exist.
   Default values can still be edited in csv files but also in the configurations dictionary at the start of the code.
   Default keywords changed to lower case
   To ensure csv edits don't cause issues the data read from files is:
     stripped of leading and trailing spaces
     keywords are converted to upper case to match FITS header keywords
     column names are converted to lower case to ensure code can work with them
     corrected data frames are saved back to csv files so that format issues are resolved
2. Improved extract header function
   converts floats to 4 decimal places
   converts dates to format %Y-%m-%d, truncates input to microseconds to ensure conversion works
   creates a subset of the header data that matches AstroBin requirements
3. sites.csv file latitudes and longitudes saved with 4 decimal places but processed to 2 decimal
   places to ensure the same site is not recorded multiple times.
4. Corrected issue with Bortle and SQM values not being updated correctly
5. Corrected issue with Keywords from .XISF files not being read correctly
6. Improved code to correct file data reading and saving logic
7. Runtime option to stop program if new csv files are created to edit them.
8. Corrected program logic related to import, access and storage of external parameters.
9. Refactored code to improve readability
10. Updated docstrings
11. Works with files generated by both Sequence Generator Pro (SGP) and NINA (.FITS, .FIT, .FTS, .XISF)
12. Looks for filters in FITS headers and converts them to the 5 digit codes used by AstroBin (these used to be four digit codes)
--------------------------------------------------------------------------------------------
Version 1.0.3
27th November 2023 changes
1. Handles pre and post text spaces in data from csv files
2. Processes both FITS and XISF files or a mixture of both
3. Focal ratio now extracted from header and reported.
4. Exports a session summary report
--------------------------------------------------------------------------------------------
Version 1.0.2
24th November 2023 changes
1. Changes to how the code handles missing Keywords from FITS headers.
2. The code uses a defaults.csv to enable the user to configure values for missing keywords.
These default keywords are then applied to all missing header keywords allowing for a more complete upload of information to AstroBin.
The changes attempt to make the code agnostic to the types of FITS headers processed.
3. HFR recovered from the defaults.csv file, instead of a command line entry.
--------------------------------------------------------------------------------------------
Version 1.0.1
23rd November 2023 changes
1. Code checks last LIGHT frame to determine if FITS was generated by NINA
--------------------------------------------------------------------------------------------
Version 1.0.0
23rd November 2023
1. Initial release
--------------------------------------------------------------------------------------------
acquisition.csv uploader, see https://welcome.astrobin.com/importing-acquisitions-from-csv/
This implementation is neither endorsed by nor related to the AstroBin development team.
Copyright (C) 2023 Steve Greaves
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, version 3 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along with
this program. If not, see <http://www.gnu.org/licenses/>.
"""
__version__ = "1.0.4"
import pandas as pd
import os
import sys
from astropy.io import fits
import struct
import xml.etree.ElementTree as ET
import requests
import math
import re
from datetime import datetime
import numpy as np
"""
Configuration dictionary used to create the default values csv file. You can edit these values or modify the csv files created to suit your needs.
The 'configurations' dictionary is organized into four key sections:
1. 'defaults': Contains default values for various parameters related to astronomical imaging.
- 'key': Lists the parameters in the header files required by AstroBin. Do not change these keys.
- 'value': Provides the default values for each corresponding key. These can be modified to suit your equipment setup.
- 'comment': Offers a brief description or comment for each parameter.
2. 'filters': Maps your astronomical filters to their respective codes. Ensure the names match those that your image creation package uses and ensure that the AstroBin codes are correct for your filters.
The default codes are for a 2 inch Astronomik LRGB and narrowband set.
- 'filter': Lists the names of the filters (e.g., 'Ha', 'SII', 'OIII', etc.).
- 'code': Provides the corresponding AstroBin code for each filter.
3. 'secret': Stores sensitive information for API access. You will have to edit this section to include your own API key.
- 'api key': The key required for accessing the relevant API.
- 'api endpoint': The URL of the API endpoint.
4. 'sites': Holds information about observation sites. This is updated automatically by the script when a new site is encountered.
- 'latitude', 'longitude': The geographical coordinates of the site.
- 'bortle', 'sqm': Bortle scale classification and Sky Quality Meter reading for the site.
This dictionary is instrumental in initializing, validating, and processing data in the astronomical
data analysis pipeline. It ensures the data obtained can be uploaded successfully to AstroBin.
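
As a minimal sketch of how one section maps onto a csv-style table, the snippet below
builds a DataFrame from a cut-down, hypothetical copy of the 'filters' section. The
variable names are illustrative only and are not part of this script.

```python
import pandas as pd

# Hypothetical, cut-down copy of the 'filters' section, for illustration only.
example_filters = {
    'filter': ['Ha', 'OIII'],
    'code': [4663, 4752],
}

# Each dictionary key becomes a column and each list position becomes a row.
filters_df = pd.DataFrame(example_filters)

# Look up the AstroBin code for a filter name.
code_by_name = dict(zip(filters_df['filter'], filters_df['code']))
ha_code = int(code_by_name['Ha'])
```

Because every list in a section becomes one column, the lists within a section need to
have the same length.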
"""
configurations = {
    'defaults': {
        'key': ['IMAGETYP','EXPOSURE', 'DATE-LOC', 'XBINNING', 'GAIN', 'XPIXSZ', 'CCD-TEMP', 'FOCALLEN', 'FOCRATIO', 'SITELAT', 'SITELONG', 'FILTER', 'OBJECT', 'FOCTEMP', 'SWCREATE','HFR'],
        'value': ['LIGHT','100', '2023-01-01', '1', '0', '1', '-10', '540', '5.4', '52.25', '-0.12', 'No Filter', 'No target', '20','Unknown package', '1.6'],
        'comment': ['Exposure type','Exposure time in seconds', 'Observation date', 'Camera binning', 'Camera gain', 'Camera pixel size in um', 'Camera sensor temperature in degrees C', 'Telescope focal length in mm', 'Telescope focal ratio', 'Observation site latitude in decimal degrees', 'Observation site longitude in decimal degrees', 'Filter name', 'Target name', 'Ambient temperature in degrees C as measured by the focuser','Creation package', 'Half-flux radius in pixels']
    },
    'filters': {
        'filter': ['Ha', 'SII', 'OIII', 'Red', 'Green', 'Blue', 'Lum', 'CLS'],
        'code': [4663, 4844, 4752, 4649, 4643, 4637, 2906, 4061]
    },
    'secret': {
        'api key': 'xxxxxxxxxx',  # enter your API key here
        'api endpoint': 'https://www.lightpollutionmap.info/QueryRaster/'
    },
    'sites': {
        'latitude': '',
        'longitude': '',
        'bortle': '',
        'sqm': ''
    }
}
def read_or_create_csv(dictionaries):
    """
    Reads from or creates CSV files based on the input dictionaries.
    This function iterates over each dictionary provided in the input 'dictionaries'.
    For each dictionary, the function checks if a corresponding CSV file (named after the dictionary) exists.
    If the file exists, it reads the CSV file into a pandas DataFrame.
    If the file does not exist, it creates a new CSV file from the dictionary data, ensuring to:
    - Convert scalar values to single-item lists.
    - Replace NaN values with an empty string.
    It also strips whitespace from string columns, converts column names to lower case and
    ensures that the defaults 'key' column is upper case to match the header file keywords.
    After processing, it saves any changes to the CSV file and updates the DataFrame in the output dictionary.
    Parameters:
    - dictionaries (dict): A dictionary where keys are the names for the CSV files to be read or created,
      and values are dictionaries containing the data for the corresponding CSV file.
    Returns:
    - tuple:
      - A dictionary of DataFrames corresponding to each input dictionary.
      - A boolean flag indicating whether any new CSV file was created during the function's execution.
    """
    dataframes = {}
    file_created = False  # Flag to track if any file is created
    # Iterate over each dictionary in the input
    for dictionary_name, dictionary_data in dictionaries.items():
        csv_file = f"{dictionary_name}.csv"
        # Convert scalar values to single-item lists
        for key, value in dictionary_data.items():
            if not isinstance(value, list):
                dictionary_data[key] = [value]
        # Check if the CSV file already exists
        if os.path.exists(csv_file):
            # If it exists, read the DataFrame from the CSV file
            df = pd.read_csv(csv_file)
            print('Reading', csv_file)
        else:
            # If it doesn't exist, create a new DataFrame from the dictionary data
            df = pd.DataFrame(dictionary_data)
            print(f"File '{csv_file}' was missing, so it was created.")
            file_created = True  # Set the flag to True as a file was created
            # Replace NaN values with an empty string
            df = df.fillna('')
            # Save the DataFrame to the CSV file
            df.to_csv(csv_file, index=False)
        # Strip whitespace from string columns (object dtype)
        df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
        # Strip whitespace from column names and make them lower case
        df.columns = df.columns.str.strip().str.lower()
        if dictionary_name == 'defaults':
            # Ensure the 'key' column is upper case to match the header file keywords
            df['key'] = df['key'].str.upper()
        if dictionary_name == 'sites':
            # Treat empty site entries as missing values
            df = df.replace('', np.nan)
        # Check if any changes were made (i.e., if there were any spaces stripped)
        if df.to_csv(index=False) != pd.read_csv(csv_file).to_csv(index=False):
            # Save the DataFrame back to the CSV file to correct errored input data
            df.to_csv(csv_file, index=False)
        dataframes[dictionary_name] = df
    return dataframes, file_created
def read_xisf_header(file_path):
    """
    Reads the header of an XISF file.
    Opens and reads the XISF file specified by 'file_path'. It checks for the XISF signature,
    reads the header, and returns it as a string. If the file is not a valid XISF file or
    an error occurs, it returns None.
    Parameters:
    - file_path (str): The path to the XISF file.
    Returns:
    - str or None: The XISF header as a string, or None if the file is invalid or an error occurs.
    """
    try:
        with open(file_path, 'rb') as file:
            signature = file.read(8).decode('ascii')
            # Check for the XISF signature
            if signature != 'XISF0100':
                print("Invalid file format")
                return None
            # Read the header length, then skip the reserved field
            header_length = struct.unpack('<I', file.read(4))[0]
            file.read(4)  # Skip reserved field
            xisf_header = file.read(header_length).decode('utf-8')
            return xisf_header
    except Exception as e:
        print(f"Error: {e}")
        return None
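"""
The byte layout that read_xisf_header relies on can be tried without a real image file.
The sketch below builds a tiny in-memory blob with the same layout and parses it back.
The helper name parse_xisf_header and the one-line XML payload are hypothetical and
exist only for this illustration.

```python
import io
import struct

# Hypothetical header payload; a real XISF header is a much larger XML document.
xml_text = '<xisf version="1.0"></xisf>'
payload = xml_text.encode('utf-8')

# Layout assumed from read_xisf_header: 8-byte signature, 4-byte little-endian
# header length, 4 reserved bytes, then the XML header itself.
blob = b'XISF0100' + struct.pack('<I', len(payload)) + bytes(4) + payload

def parse_xisf_header(stream):
    # Return the XML header text, or None if the signature does not match.
    if stream.read(8) != b'XISF0100':
        return None
    (header_length,) = struct.unpack('<I', stream.read(4))
    stream.read(4)  # skip the reserved field
    return stream.read(header_length).decode('utf-8')

parsed = parse_xisf_header(io.BytesIO(blob))
```

Reading from a stream rather than a path keeps the parsing logic testable without
touching the file system.
"""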
def xml_to_data(xml_data):
    """
    Converts XML data to a dictionary.
    Parses XML data, specifically extracting 'FITSKeyword' tags, and converts them into
    a dictionary with 'name' as keys and 'value' as values.
    Parameters:
    - xml_data (str): A string containing XML data.
    Returns:
    - dict: A dictionary containing data extracted from XML.
    """
    # Register the namespace
    ns = {'xisf': 'http://www.pixinsight.com/xisf'}
    ET.register_namespace('', ns['xisf'])
    # Parse the XML data
    root = ET.fromstring(xml_data)
    # Create a dictionary to store our data
    data = {}
    # Iterate through each 'FITSKeyword' tag in the XML
    for fits_keyword in root.findall('.//xisf:FITSKeyword', namespaces=ns):
        name = fits_keyword.get('name')
        value = fits_keyword.get('value')
        # Add the 'name' and 'value' to the dictionary
        data[name] = value
    return data
def sync_headers(default_header, fits_header):
    """
    Synchronizes FITS headers with default headers.
    Takes 'default_header' and 'fits_header'. It keeps only the keys of 'fits_header'
    that also appear in 'default_header', filling in missing values from
    'default_header' as needed.
    Parameters:
    - default_header (dict): A mapping whose 'value' entry is a pandas Series of default
      values indexed by header keyword.
    - fits_header (dict): A dictionary containing FITS header values.
    Returns:
    - dict: A dictionary representing the synchronized FITS header.
    """
    # Initialize an empty dictionary
    updated_fits_header = {}
    # Add entries from fits_header that also exist in default_header
    for k, v in fits_header.items():
        if k in default_header['value'].index:
            updated_fits_header[k] = v
    # Add entries from default_header that don't exist in updated_fits_header
    for k, v in default_header['value'].items():
        if k not in updated_fits_header:
            updated_fits_header[k] = v
    return updated_fits_header
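"""
The merge rule in sync_headers can be seen on a small case. The sketch below uses plain
dictionaries instead of the pandas Series that the script passes in, which is a
simplification made for this illustration; the sample keys and values are hypothetical.

```python
# Hypothetical defaults and header, reduced to plain dicts for illustration.
defaults = {'GAIN': '0', 'FILTER': 'No Filter', 'OBJECT': 'No target'}
header = {'GAIN': 100, 'OBJECT': 'M31', 'AIRMASS': 1.2}

# Keep only the keys named in the defaults, preferring the header's value
# and falling back to the default when the header has no such key.
synced = {key: header.get(key, default) for key, default in defaults.items()}
```

Here 'AIRMASS' is dropped because it has no default, and 'FILTER' is filled in from the
defaults because the header does not carry it.
"""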
def try_parse_date(s):
    '''
    Attempts to parse a string as a date in the format '%Y-%m-%dT%H:%M:%S.%f' and returns it as '%Y-%m-%d'.
    If it fails, it raises a ValueError.
    Parameters:
    s (str): The string to parse.
    Returns:
    str: The parsed date string in the format '%Y-%m-%d'.
    Raises:
    ValueError: If the string cannot be parsed as a date.
    '''
    try:
        # Truncate to microsecond precision
        s = s[:26]
        return datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f').strftime('%Y-%m-%d')
    except ValueError:
        raise ValueError("Could not parse date")
def dms_to_decimal(dms_str):
    '''
    Converts a string in the format 'degrees minutes seconds' to a decimal degree value.
    Parameters:
    dms_str (str): The string to convert.
    Returns:
    float: The converted decimal degree value. If the string is not in
    'degrees minutes seconds' format it is returned unchanged.
    '''
    match = re.match(r"([+-]?\d+)\s+(\d+)\s+([\d\.]+)", dms_str)
    if match:
        degrees, minutes, seconds = map(float, match.groups())
        # Take the sign from the text so that '0 30 0' is positive and '-0 30 0' is negative
        sign = -1 if match.group(1).startswith('-') else 1
        return sign * round(abs(degrees) + minutes / 60 + seconds / 3600, 4)
    else:
        return dms_str
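"""
A worked example of the degrees, minutes, seconds conversion. This sketch splits on
whitespace instead of using a regular expression and takes the sign from the text, so a
value such as '-0 7 12' stays negative. The helper name dms_to_degrees is hypothetical.

```python
def dms_to_degrees(text):
    # Split 'degrees minutes seconds' on whitespace (assumes exactly three fields).
    degrees, minutes, seconds = text.split()
    # Take the sign from the text so that '-0 7 12' stays negative.
    sign = -1 if degrees.startswith('-') else 1
    magnitude = abs(float(degrees)) + float(minutes) / 60 + float(seconds) / 3600
    return sign * round(magnitude, 4)

north = dms_to_degrees('52 15 0')   # 52 + 15/60
west = dms_to_degrees('-0 7 12')    # -(7/60 + 12/3600)
```

The two inputs correspond to the default site latitude and longitude in the
configurations dictionary, 52.25 and -0.12.
"""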
def round_floats_and_convert_datetime_in_dict(d):
    '''
    Iterates over a dictionary, rounds float values to 4 decimal places, and converts datetime strings to the format '%Y-%m-%d'.
    Strings that are neither numbers nor dates are passed to dms_to_decimal.
    Parameters:
    d (dict): The dictionary to process.
    Returns:
    dict: The processed dictionary with rounded float values and converted datetime strings.
    '''
    for key, value in d.items():
        try:
            # Try to convert to float and round
            d[key] = round(float(value), 4)
        except (ValueError, TypeError):
            # If it's not a float, try to convert it to a date
            if isinstance(value, str):
                try:
                    d[key] = try_parse_date(value)
                except ValueError:
                    # If it's not a date, try to convert it from DMS to decimal degrees
                    d[key] = dms_to_decimal(value)
    return d
def extract_headers(directories, default_values):
    """
    Extracts headers from FITS and XISF files in the specified directories.
    Parameters:
    - directories (list): A list of directories to search for files.
    - default_values (DataFrame): A DataFrame containing default header values.
    Returns:
    - pandas.DataFrame: One row per processed file, holding the processed header values.
    """
    headers = []  # List to store the processed headers
    header = None  # Last header read; stays None if no supported file is found
    # Set 'key' as the index in default_values DataFrame if not already set
    try:
        default_values.set_index('key', inplace=True)
    except KeyError:
        # 'key' is already the index
        pass
    # Convert default_values DataFrame to a dictionary for easy lookup
    default_header = dict(default_values)
    # Iterate over each directory provided
    for directory in directories:
        # Walk through the directory
        for root, _, files in os.walk(directory):
            print(f"Extracting headers from directory: {root}")
            # Process each file in the directory
            for file in files:
                file_path = os.path.join(root, file)  # Full path of the file
                # Check file extension and process accordingly
                if file.lower().endswith(('.fits', '.fit', '.fts', '.xisf')):
                    try:
                        if file.lower().endswith(('.fits', '.fit', '.fts')):
                            # Open FITS file and extract header
                            with fits.open(file_path) as hdul:
                                header = hdul[0].header
                        elif file.lower().endswith('.xisf'):
                            # Read and parse XISF file header
                            header_xml = read_xisf_header(file_path)
                            header = xml_to_data(header_xml)
                        # Convert header to dictionary and process
                        header_dict = dict(header)
                        # Synchronize header with default values
                        reduced_header_dict = sync_headers(default_header, header_dict)
                        # Round floats and convert datetime strings in the dictionary
                        reduced_header_dict = round_floats_and_convert_datetime_in_dict(reduced_header_dict)
                        # Add the file name
                        reduced_header_dict['FILENAME'] = os.path.basename(file_path)
                        # Accumulate processed headers
                        headers.append(reduced_header_dict)
                    except Exception as e:
                        print(f"Error reading {file_path}: {e}")
    # Extract software capture information from the last processed header
    swcreate = header.get('CREATOR', header.get('SWCREATE', 'unknown package')) if header is not None else 'unknown package'
    print(f"\nImages captured by {swcreate}")
    return pd.DataFrame(headers)  # Return a DataFrame with one row per processed header
def format_seconds_to_hms(seconds):
    """
    Formats a time duration given in seconds into a human-readable format (hours, minutes, seconds).
    Converts a duration in seconds to a string format, expressing the duration in hours, minutes,
    and seconds. For example, 3661 seconds would be converted to '1 hrs 1 mins 1 secs'.
    Parameters:
    - seconds (int or float): The time duration in seconds.
    Returns:
    - str: The formatted time string.
    """
    # Divide the total seconds into hours and remainder seconds
    hours, remainder = divmod(seconds, 3600)
    # Further divide the remainder into minutes and seconds
    minutes, seconds = divmod(remainder, 60)
    # Initialize an empty list to hold time parts
    time_parts = []
    # Append hours to time_parts if hours is greater than 0
    # (int() keeps float input from printing as '1.0 hrs')
    if hours > 0:
        time_parts.append(f"{int(hours)} hrs")
    # Append minutes to time_parts if minutes is greater than 0
    if minutes > 0:
        time_parts.append(f"{int(minutes)} mins")
    # Append seconds to time_parts
    time_parts.append(f"{seconds:.0f} secs")
    # Join the time parts with spaces and return the formatted string
    return ' '.join(time_parts)
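"""
The decomposition used by format_seconds_to_hms, shown on the docstring's own example of
3661 seconds. This is a stand-alone sketch with illustrative variable names.

```python
total_seconds = 3661

# Split off whole hours first, then whole minutes from what is left.
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

label = f"{hours} hrs {minutes} mins {seconds} secs"
```

With an integer input all three parts are integers, so they format without a decimal
point.
"""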
def summarize_session(df):
    """
    Builds a text summary of an observation session from a DataFrame of header data.
    LIGHT and FLAT frames are summarized per FILTER, BIAS and DARK frames per GAIN.
    Parameters:
    - df (pandas.DataFrame): The DataFrame containing FITS header data.
    Returns:
    - str: The session summary text.
    """
    summary = ""
    # Check if 'IMAGETYP' column exists in the DataFrame
    if 'IMAGETYP' in df:
        txt = "\nObservation session Summary:\n"
        summary += txt
        # Process different frame types: LIGHT, FLAT, BIAS, and DARK
        for imagetyp in ['LIGHT', 'FLAT', 'BIAS', 'DARK']:
            if imagetyp in df['IMAGETYP'].values:
                group = df[df['IMAGETYP'] == imagetyp]
                # Process LIGHT and FLAT frames
                if imagetyp in ['LIGHT', 'FLAT']:
                    txt = f"\n{imagetyp}S:\n"
                    summary += txt
                    # Group by FILTER and summarize
                    for filter_type, group_df in group.groupby('FILTER'):
                        frame_count = group_df.shape[0]
                        total_exposure = group_df['EXPOSURE'].astype(float).sum()
                        formatted_time = format_seconds_to_hms(total_exposure)
                        txt = f"\n Filter {filter_type}:\t {frame_count} frames, Exposure time: {formatted_time}"
                        summary += txt
                    summary += '\n'
                # Process BIAS and DARK frames, grouped by GAIN
                elif imagetyp in ['BIAS', 'DARK']:
                    for gain_value, gain_group in group.groupby('GAIN'):
                        frame_count = gain_group.shape[0]
                        total_exposure = gain_group['EXPOSURE'].astype(float).sum()
                        formatted_time = format_seconds_to_hms(total_exposure)
                        txt = f"\n{imagetyp} with GAIN {gain_value}:\t {frame_count} frames, Exposure time: {formatted_time}"
                        summary += txt
                # Additional summary for LIGHT frames
                if imagetyp == 'LIGHT':
                    total_light_exposure = group['EXPOSURE'].astype(float).sum()
                    formatted_total_light_time = format_seconds_to_hms(total_light_exposure)
                    txt = f"\nTotal session exposure for LIGHTs:\t {formatted_total_light_time}\n"
                    summary += txt
    else:
        # Handling case where 'IMAGETYP' column is not present
        txt = "No 'IMAGETYP' column found in headers."
        summary += txt
    return summary
def create_calibration_df(df):
    """
    Generates a DataFrame summarizing calibration frame data from a given DataFrame based on
    specific image types (IMAGETYP), GAIN values, and FILTER values where applicable.
    This function filters the input DataFrame for relevant calibration frame types (DARK, BIAS,
    FLAT, FLATDARKS), then groups the data by these types along with GAIN, and FILTER (for FLAT frames).
    It provides a count of each group, which is useful for assessing the calibration data available.
    Parameters:
    - df (pandas.DataFrame): The DataFrame containing FITS header data.
    Returns:
    - pandas.DataFrame: A DataFrame with columns 'TYPE', 'GAIN', 'FILTER' (if applicable),
      and 'NUMBER' representing the count of each group.
    """
    # Define relevant frame types for calibration
    relevant_types = ['DARK', 'BIAS', 'FLAT', 'FLATDARKS']
    # Filter the DataFrame for relevant frame types
    filtered_df = df[df['IMAGETYP'].isin(relevant_types)].copy()
    # Group by IMAGETYP and GAIN, and additionally by FILTER for FLAT frames
    if 'FILTER' in df.columns:
        # Set FILTER to empty string for non-FLAT frames
        filtered_df.loc[filtered_df['IMAGETYP'] != 'FLAT', 'FILTER'] = ''
        # Group by TYPE, GAIN, and FILTER, and count the number of frames
        group_counts = filtered_df.groupby(['IMAGETYP', 'GAIN', 'FILTER']).size().reset_index(name='NUMBER')
    else:
        # Group by TYPE and GAIN if FILTER column doesn't exist, and count the number of frames
        group_counts = filtered_df.groupby(['IMAGETYP', 'GAIN']).size().reset_index(name='NUMBER')
    # Rename 'IMAGETYP' column to 'TYPE'
    return group_counts.rename(columns={'IMAGETYP': 'TYPE'})
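"""
The grouping step in create_calibration_df can be tried on a handful of rows. The frame
data below is hypothetical and only illustrates how the counts come out.

```python
import pandas as pd

# Hypothetical header rows, for illustration only.
frames = pd.DataFrame({
    'IMAGETYP': ['DARK', 'DARK', 'FLAT', 'FLAT', 'FLAT', 'LIGHT'],
    'GAIN': [100, 100, 100, 100, 100, 100],
    'FILTER': ['', '', 'Ha', 'Ha', 'OIII', 'Ha'],
})

# Keep calibration frames only, then count the rows in each group.
calibration = frames[frames['IMAGETYP'].isin(['DARK', 'BIAS', 'FLAT', 'FLATDARKS'])]
counts = (
    calibration.groupby(['IMAGETYP', 'GAIN', 'FILTER'])
    .size()
    .reset_index(name='NUMBER')
)
```

The LIGHT row is filtered out before grouping, so it never appears in the counts.
"""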
def create_lights_df(df: pd.DataFrame) -> pd.DataFrame:
    """
    Creates a DataFrame for 'LIGHT' type data
    Args:
    df (pd.DataFrame): DataFrame containing FITS header data.
    Returns:
    pd.DataFrame: DataFrame containing only the 'LIGHT' type rows.
    """
    # Filter the DataFrame for rows where the image type is 'LIGHT'
    light_df = df[df['IMAGETYP'] == 'LIGHT'].copy()
    # Return the DataFrame with 'LIGHT' type data
    return pd.DataFrame(light_df)
def sqm_to_bortle(sqm):
    """
    Converts an SQM (Sky Quality Meter) value to the corresponding Bortle scale classification.
    The Bortle scale is a nine-level numeric scale used to quantify the astronomical observability of celestial objects,
    affected by light pollution. The scale ranges from 1, indicating the darkest skies, to 9, the brightest.
    Args:
    sqm (float): The SQM value indicating the level of light pollution.
    Returns:
    int: The Bortle scale classification (ranging from 1 to 9).
    """
    # Bortle scale classification based on SQM values.
    # Each branch only needs a lower bound because the higher classes have already
    # been ruled out; this avoids gaps such as 21.495 falling through to class 9.
    if sqm > 21.99:
        return 1  # Class 1: Excellent dark-sky site
    elif sqm >= 21.50:
        return 2  # Class 2: Typical truly dark site
    elif sqm >= 21.25:
        return 3  # Class 3: Rural sky
    elif sqm >= 20.50:
        return 4  # Class 4: Rural/suburban transition
    elif sqm >= 19.50:
        return 5  # Class 5: Suburban sky
    elif sqm >= 18.50:
        return 6  # Class 6: Bright suburban sky
    elif sqm >= 17.50:
        return 7  # Class 7: Suburban/urban transition
    elif sqm >= 17.00:
        return 8  # Class 8: City sky
    else:
        return 9  # Class 9: Inner-city sky
def get_bortle_sqm(lat: float, lon: float, secret_df):
    """
    Retrieves the Bortle scale classification and SQM (Sky Quality Meter) value for a given latitude and longitude.
    Parameters:
    - lat (float): The latitude coordinate.
    - lon (float): The longitude coordinate.
    - secret_df (pandas.DataFrame): A DataFrame containing the API key and endpoint.
    Returns:
    - tuple: A tuple containing the Bortle scale classification, SQM value, error message (if any),
      and flags indicating the validity of the API key and endpoint.
    """
    def is_valid_api_key(api_key):
        """ Check if the API key is valid. """
        return api_key is not None and len(api_key) == 16 and api_key.isalnum()

    def is_valid_api_endpoint(api_endpoint):
        """ Check if the API endpoint is valid. """
        return bool(api_endpoint and api_endpoint.strip())

    if secret_df.empty or secret_df.isna().values.any():
        return 0, 0, "api_key and/or api_endpoint are empty", False, False
    # Extract the API key and endpoint from the DataFrame
    api_key = secret_df.get('api key', pd.Series([None])).iloc[0].strip() if isinstance(secret_df.get('api key', pd.Series([None])).iloc[0], str) else None
    api_endpoint = secret_df.get('api endpoint', pd.Series([None])).iloc[0].strip() if isinstance(secret_df.get('api endpoint', pd.Series([None])).iloc[0], str) else None
    api_valid = is_valid_api_key(api_key)
    api_endpoint_valid = is_valid_api_endpoint(api_endpoint)
    # Check the validity of the API key and endpoint
    if not api_valid and not api_endpoint_valid:
        return 0, 0, "Both API key and API endpoint are invalid.", api_valid, api_endpoint_valid
    elif not api_valid:
        return 0, 0, "API key is malformed.", api_valid, api_endpoint_valid
    elif not api_endpoint_valid:
        return 0, 0, "API endpoint is empty.", api_valid, api_endpoint_valid
    # Define the parameters for the API request
    params = {
        'ql': 'wa_2015',
        'qt': 'point',
        'qd': f'{lon},{lat}',
        'key': api_key
    }
    try:
        response = requests.get(api_endpoint, params=params)
        response.raise_for_status()
        if response.text.strip() == 'Invalid authentication.':
            return 0, 0, "Authentication error: Missing or invalid API key.", False, api_endpoint_valid
        artificial_brightness = float(response.text)
        sqm = (math.log10((artificial_brightness + 0.171168465)/108000000)/-0.4)
        bortle_class = sqm_to_bortle(sqm)
        return bortle_class, round(sqm, 2), None, api_valid, api_endpoint_valid
    except requests.exceptions.HTTPError as err:
        return 0, 0, f"HTTP Error: {err}", api_valid, False
    except ValueError:
        return 0, 0, "Could not convert response to float.", api_valid, api_endpoint_valid
    except Exception as e:
        return 0, 0, f"An error occurred: {e}", api_valid, api_endpoint_valid
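"""
The brightness-to-SQM conversion inside get_bortle_sqm can be checked in isolation. The
sketch below reuses the same constants; the helper name sqm_from_brightness is
hypothetical. Reading the added constant as a natural background term is an assumption
made here, based on the observation that zero artificial brightness works out to about
22.0.

```python
import math

def sqm_from_brightness(artificial_brightness):
    # Same constants as get_bortle_sqm: add the background term, scale,
    # then convert the ratio to magnitudes (dividing by -0.4 is -2.5 * log10).
    ratio = (artificial_brightness + 0.171168465) / 108000000
    return math.log10(ratio) / -0.4

dark_site = sqm_from_brightness(0.0)      # roughly 22.0
brighter_site = sqm_from_brightness(1.0)  # lower than dark_site
```

A larger artificial brightness gives a smaller SQM value, which maps to a higher Bortle
class.
"""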
def calculate_auxiliary_parameters(df, defaults_df, secret_df, sites_df):
"""
Calculates auxiliary parameters for a DataFrame containing FITS header data.
This function calculates and adds auxiliary parameters to the DataFrame, including:
- BORTLE: Bortle scale classification for the observation site.
- SQM: Sky Quality Meter reading for the observation site.
- HFR: Half-flux radius in pixels.
- IMSCALE: Image scale in arcseconds per pixel.
- FWHM: Full-width at half-maximum in arcseconds.
Parameters:
- df (pandas.DataFrame): A DataFrame containing FITS header data.
- defaults_df (pandas.DataFrame): A DataFrame containing default header values.
- secret_df (pandas.DataFrame): A DataFrame containing API key and endpoint.
- sites_df (pandas.DataFrame): A DataFrame containing observation site data.
Returns:
- pandas.DataFrame: The DataFrame with auxiliary parameters added.
"""
#main code for calculate_auxiliary_parameters
# Convert Lat and Long to float and round to 2 decimal places
#df['SITELAT'] = df['SITELAT'].astype(float).round(2)
#df['SITELONG'] = df['SITELONG'].astype(float).round(2)
bortle, sqm, api_response_text, valid_api_key, valid_api_endpoint = get_bortle_sqm('0.0', '54.0',secret_df)
#check if sites_df is empty
empty_sites = sites_df.empty or (sites_df.values == '').any() or sites_df.isna().values.any()
# Extract default HFR value from defaults DataFrame
hfr_set = defaults_df.loc['HFR', 'value'].strip()
# Set to keep track of processed latitude-longitude pairs
processed_sites = set()
# Iterate over each row in the DataFrame
for index, row in df.iterrows():
lat, lon = row['SITELAT'],row['SITELONG']
latr, lonr = round(lat,2),round(lon,2)
# Checking for existing site data in sites_df
site_data = ((round(sites_df['latitude'],2) == latr) & (round(sites_df['longitude'],2) == lonr)).any()
#check lat and long dont exist in processd sites
coordinates_processed = (latr, lonr) in processed_sites
if not site_data:
if valid_api_key:
# Fetch Bortle and SQM from API if
# API key has valid form
# and
# (latitude and longtitude not found in sites_df, ie new site
bortle, sqm, api_response_text, valid_api_key, valid_api_endpoint = get_bortle_sqm(latr, lonr,secret_df)
#check if get_bortle_sqm returned valid values
if api_response_text is not None:
msg = f"\nAPI request failed for lat {lat}, lon {lon}: Using 0 for Bortle and SQM. "
bortle, sqm = 0, 0
else:
# Adding new site data to sites_df and save it
new_site = {'latitude': lat, 'longitude': lon, 'bortle': bortle, 'sqm': sqm}
new_site_df = pd.DataFrame([new_site])
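if empty_sites:
sites_df = new_site_df
# sites_df now holds real data, so later new sites must be appended, not overwrite it
empty_sites = False
else:
sites_df = pd.concat([sites_df, new_site_df], ignore_index=True, sort=False)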
sites_df.to_csv('sites.csv', index=False)
msg = f"\nRetrieved bortle {bortle} and sqm {sqm} for lat {lat}, lon {lon} from api endpoint"
else:
msg = f"\nlat {lat}, lon {lon} not in sites.csv and invalid api key: using 0 for bortle and sqm."
bortle, sqm = 0, 0
else:
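# Use the stored values for the matching site rather than always the first row
site_row = sites_df[(round(sites_df['latitude'],2) == latr) & (round(sites_df['longitude'],2) == lonr)].iloc[0]
bortle, sqm = site_row['bortle'], site_row['sqm']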
msg = f"\nRetrieved Bortle {bortle} and SQM {sqm} for lat {lat}, lon {lon} from sites.csv"
if (latr, lonr) not in processed_sites:
processed_sites.add((latr, lonr)) # Mark as processed
print(msg)
# Update the DataFrame with Bortle and SQM values
df.at[index, 'BORTLE'] = bortle
df.at[index, 'SQM'] = sqm
# Calculate and update HFR, IMSCALE, and FWHM values
file_path = row['FILENAME']
hfr_match = re.search(r'HFR_([0-9.]+)', file_path)
hfr = float(hfr_match.group(1)) if hfr_match and float(hfr_match.group(1)) > 0 else float(hfr_set)
imscale = float(row['XPIXSZ']) / float(row['FOCALLEN']) * 206.265
fwhm = hfr * imscale if hfr >= 0.0 else 0.0
df.at[index, 'HFR'] = round(hfr,2)
df.at[index, 'IMSCALE'] = round(imscale,2)
df.at[index, 'FWHM'] = round(fwhm,2)
print('\nCompleted sky quality extraction')
return df
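# Illustrative sketch (not part of the original script): the HFR, image-scale
# and FWHM arithmetic from calculate_auxiliary_parameters, isolated so it can
# be checked by hand. The helper name, file name and sample values are
# hypothetical. It assumes XPIXSZ is in micrometres and FOCALLEN in
# millimetres, which is what the 206.265 factor implies.
import re

def example_hfr_imscale_fwhm(file_path, xpixsz_um, focallen_mm, default_hfr):
    """Return (hfr, imscale, fwhm), each rounded to 2 decimals."""
    # Take the HFR from an 'HFR_<number>' token in the file name when present
    # and positive, otherwise fall back to the configured default
    hfr_match = re.search(r'HFR_([0-9.]+)', file_path)
    if hfr_match and float(hfr_match.group(1)) > 0:
        hfr = float(hfr_match.group(1))
    else:
        hfr = float(default_hfr)
    # Image scale in arcseconds per pixel
    imscale = float(xpixsz_um) / float(focallen_mm) * 206.265
    # FWHM in arcseconds, using the same HFR * image scale product as above
    fwhm = hfr * imscale if hfr >= 0.0 else 0.0
    return round(hfr, 2), round(imscale, 2), round(fwhm, 2)

# Example call: 4.8 um pixels at 1000 mm focal length, HFR taken from the name
# example_hfr_imscale_fwhm('M31_L_HFR_2.00_0001.fits', 4.8, 1000, '1.6')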
# Function to retrieve calibration data for a given row
def get_calibration_data(row: pd.Series, cal_type: str, calibration_df: pd.DataFrame) -> int:
"""
Retrieves the count of calibration frames for a given row based on specified calibration type.
This function matches a row from the aggregated DataFrame with the calibration DataFrame
based on the calibration type (e.g., FLAT, DARK, BIAS, FLATDARKS) and other parameters like 'GAIN'
and 'FILTER'. It returns the sum of 'NUMBER' of matched calibration frames.
Parameters:
- row (pd.Series): A series representing a row in the aggregated DataFrame.
- cal_type (str): The type of calibration data to match (e.g., 'FLAT', 'DARK').
- calibration_df (pandas.DataFrame): The DataFrame containing calibration frame data.
Returns:
- int: The total count of matching calibration frames.
"""
# Function implementation
if cal_type == 'FLAT':
# Matching both 'GAIN' and 'FILTER' for FLAT type
match = calibration_df[(calibration_df['TYPE'] == cal_type) &
(calibration_df['GAIN'] == row['gain']) &
(calibration_df['FILTER'].str.upper() == row['filter'].upper())]
else:
# Matching 'GAIN' for other types
match = calibration_df[(calibration_df['TYPE'] == cal_type) &
(calibration_df['GAIN'] == row['gain'])]
return match['NUMBER'].sum() if not match.empty else 0
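# Illustrative sketch (not part of the original script): the matching rule of
# get_calibration_data restated on plain dicts instead of DataFrames, so the
# FLAT versus non-FLAT behaviour is easy to follow. The helper name and the
# sample records in the usage comment are hypothetical.
def example_count_calibration(row, cal_type, calibration_rows):
    """Sum 'NUMBER' over records matching type and gain (and filter for FLAT)."""
    total = 0
    for cal in calibration_rows:
        if cal['TYPE'] != cal_type or cal['GAIN'] != row['gain']:
            continue
        # Only flats are filter-specific; the filter comparison ignores case
        if cal_type == 'FLAT' and cal['FILTER'].upper() != row['filter'].upper():
            continue
        total += cal['NUMBER']
    return total

# Example usage:
# cal_rows = [{'TYPE': 'FLAT', 'GAIN': 100, 'FILTER': 'Ha', 'NUMBER': 20},
#             {'TYPE': 'DARK', 'GAIN': 100, 'FILTER': '', 'NUMBER': 30}]
# example_count_calibration({'gain': 100, 'filter': 'HA'}, 'FLAT', cal_rows)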
def aggregate_parameters(lights_df, calibration_df):
"""
Aggregates astronomical observation parameters from light frames and calibration data.
This function processes a DataFrame of light frame data ('lights_df') and a DataFrame of calibration
data ('calibration_df'). It standardizes column names and formats in 'lights_df', aggregates data
by specific parameters (date, filter, gain, binning, and exposure), and adds calibration data counts
(for darks, flats, bias, and flat darks) from 'calibration_df'. The function also includes Bortle scale
and SQM values, and calculates the mean FWHM, sensor cooling, and temperature for each group.
Parameters:
- lights_df (pandas.DataFrame): A DataFrame containing data from light frames, with columns like
'date-loc', 'filter', 'gain', 'xbinning', 'exposure', etc.
- calibration_df (pandas.DataFrame): A DataFrame containing calibration frame data, with columns
like 'TYPE', 'GAIN', 'FILTER', 'NUMBER', etc.
Returns:
- pandas.DataFrame: An aggregated DataFrame with detailed information for each set of grouped parameters.
"""
# Standardizing column names to lower case and converting 'date-loc' to date format
lights_df.columns = lights_df.columns.str.lower()
lights_df['date-loc'] = pd.to_datetime(lights_df['date-loc']).dt.date
lights_df['ccd-temp'] = lights_df['ccd-temp'].astype(float).round(0)
lights_df['foctemp'] = lights_df['foctemp'].astype(float).round(2)
# Aggregating data by date, filter, gain, xbinning, and exposure
aggregated_df = lights_df.groupby(['date-loc', 'filter', 'gain', 'xbinning', 'exposure']).agg(
number=('date-loc', 'count'),
sensorCooling=('ccd-temp', 'mean'),
temperature=('foctemp', 'mean'),
meanFwhm=('fwhm', 'mean')
).reset_index()
# Renaming columns for clarity
aggregated_df.rename(columns={
'xbinning': 'binning',
'exposure': 'duration',
'focratio': 'fnumber'}, inplace=True)
# Applying get_calibration_data to aggregate calibration data
aggregated_df['darks'] = aggregated_df.apply(get_calibration_data, args=('DARK', calibration_df), axis=1)
aggregated_df['flats'] = aggregated_df.apply(get_calibration_data, args=('FLAT', calibration_df), axis=1)
aggregated_df['bias'] = aggregated_df.apply(get_calibration_data, args=('BIAS', calibration_df), axis=1)
aggregated_df['flatDarks'] = aggregated_df.apply(get_calibration_data, args=('FLATDARKS', calibration_df), axis=1)
# Adding Bortle scale and SQM values
aggregated_df['bortle'] = lights_df['bortle'].round(2)
aggregated_df['meanSqm'] = lights_df['sqm'].round(2)
# Adding fNumber, and rounding sensor cooling and temperature
aggregated_df['fNumber'] = lights_df['focratio']
aggregated_df['sensorCooling'] = aggregated_df['sensorCooling'].round().astype(int)
aggregated_df['temperature'] = aggregated_df['temperature'].astype(float).round(2)
aggregated_df['meanFwhm'] = aggregated_df['meanFwhm'].astype(float).round(2)
return aggregated_df
def update_filter(filter_value, filter_to_code):
"""
Maps a filter name to its corresponding code based on the 'filter_to_code' dictionary.
If the filter name exists in 'filter_to_code', this function returns the associated code.
If the code is missing or is not a four- or five-digit integer (1000 to 99999), a warning
is printed, and a placeholder indicating no code found is returned.
Parameters:
- filter_value (str): The name of the filter to be mapped to a code.
- filter_to_code (dict): A dictionary mapping filter names to their corresponding codes.
Returns:
- int or str: The code corresponding to the filter name, or an error message if no valid code is found.
"""
# Function implementation
# Check that the code is a four- or five-digit integer; if it is, assume it is a valid code
code = filter_to_code.get(filter_value)
# Try to convert the code to an integer
try:
code = int(code)
except (ValueError, TypeError):
code = None
#Thanks to Francisco Bitto for flagging this.
if isinstance(code, int) and 1000 <= code <= 99999:
return code
else:
# Print an error message if the code is not found for a filter
print(f"\nWarning: for filter {filter_value}: no code found. Enter a valid code in filters.csv file or input the code in the astrobin upload file.")
return f"{filter_value}: no code found"
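# Illustrative sketch (not part of the original script): the validity rule that
# update_filter applies to a looked-up code, shown on its own. The helper name
# and the sample values in the usage comment are hypothetical, not real
# AstroBin filter codes.
def example_is_valid_filter_code(raw_code):
    """Return True when raw_code converts to an int in the range 1000..99999."""
    try:
        code = int(raw_code)
    except (ValueError, TypeError):
        # Covers a missing entry (None) and non-numeric text
        return False
    return 1000 <= code <= 99999

# Example usage:
# example_is_valid_filter_code('4663')   # string that converts to an in-range int
# example_is_valid_filter_code(None)     # filter name absent from the mapping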
def create_astrobin_output(df, filter_df):
"""
Transforms a DataFrame into a format suitable for AstroBin output.
This function takes a DataFrame containing astronomical observation data and a DataFrame mapping
filter names to codes. It updates the 'filter' column in the observation DataFrame to use filter codes
instead of names. The function also reorders and renames columns to match the expected format for AstroBin,
a platform for sharing astrophotography. It ensures the DataFrame columns align with AstroBin's data
requirements, including the transformation of filter names into corresponding codes.
Parameters:
- df (pandas.DataFrame): A DataFrame containing observation data.
- filter_df (pandas.DataFrame): A DataFrame mapping filter names to their corresponding codes.
Returns:
- pandas.DataFrame: The transformed DataFrame with columns renamed and reordered to match AstroBin's format.
"""
# Mapping filter name to filter code
filter_to_code = filter_df.set_index('filter')['code']
# Finding the position of the 'filter' column in df
filter_col_index = df.columns.get_loc('filter')
# Apply the update_filter function to the 'filter' column if the contents are strings
if df['filter'].dtype == 'object':
df['filter'] = df['filter'].apply(lambda x: update_filter(x, filter_to_code))
# Reordering columns to match AstroBin's expected format
column_order = ['date', 'filter', 'number', 'duration', 'binning', 'gain',
'sensorCooling', 'fNumber', 'darks', 'flats', 'flatDarks', 'bias', 'bortle',
'meanSqm', 'meanFwhm', 'temperature']
# Renaming columns to match AstroBin's expected format
df.rename(columns={'date-loc': 'date'}, inplace=True)
# Return the transformed DataFrame with columns in AstroBin's expected order
return df[column_order]
def main():
"""
Main function to process astronomical observation data for analysis and AstroBin output.
This function performs several steps to process astronomical observation data:
1. Reads or creates configuration CSV files.
2. Validates directory paths provided via command line arguments.
3. Extracts FITS headers from files in the given directories.
4. Creates and prints a summary of the observation session.
5. Creates DataFrame for calibration data.
6. Creates DataFrame for light frame data.
7. Calculates auxiliary parameters like Bortle scale and SQM.
8. Aggregates parameters for analysis.
9. Transforms data into a format suitable for AstroBin output.
The function also exports the session summary and final data to text and CSV files, respectively.
It uses several configuration files (defaults, filters, secrets, sites) to aid in processing.
If new configuration files are created, the user is given the option to terminate the program
to edit these files before proceeding.
"""
# Implementation of the main function
# Step 1: Read or create configuration CSV files
params, file_created = read_or_create_csv(configurations)
# Check if any configuration file was created
if file_created:
# Give the user the option to terminate the program
user_input = input("New configuration files were created. Do you wish to edit them before continuing? (y/n): ")
if user_input.lower() == 'y':
print("Exiting the program. Please edit the configuration files as needed and rerun the script.")