Core Functions
API for core operations.
pyngb.read_ngb(path, *, return_metadata=False, baseline_file=None, dynamic_axis='sample_temperature')
```python
read_ngb(
    path: str,
    *,
    return_metadata: Literal[False] = False,
    baseline_file: str | None = None,
    dynamic_axis: str = "sample_temperature",
) -> pa.Table

read_ngb(
    path: str,
    *,
    return_metadata: Literal[True],
    baseline_file: str | None = None,
    dynamic_axis: str = "sample_temperature",
) -> tuple[FileMetadata, pa.Table]
```
Read NETZSCH NGB file data with optional baseline subtraction.
This is the primary function for loading NGB files. By default, it returns a PyArrow table with embedded metadata. For direct metadata access, use return_metadata=True. When baseline_file is provided, baseline subtraction is performed automatically.
Parameters
path : str
    Path to the NGB file (`.ngb-ss3` or similar extension). Supports absolute and relative paths.
return_metadata : bool, default False
    If False (default), return a PyArrow table with embedded metadata. If True, return a (metadata, data) tuple.
baseline_file : str or None, default None
    Path to a baseline file (`.ngb-bs3`) for baseline subtraction. If provided, baseline subtraction is performed automatically. The baseline file must have a temperature program identical to the sample file's.
dynamic_axis : str, default "sample_temperature"
    Axis used to align dynamic segments during baseline subtraction. Options: "time", "sample_temperature", "furnace_temperature".
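Since an invalid `dynamic_axis` value only surfaces once baseline subtraction runs, it can be useful to validate it up front. The sketch below does that; the `check_axis` helper is hypothetical, not part of pyngb, and the file names are placeholders:

```python
# Valid choices for dynamic_axis, per the parameter list above.
VALID_AXES = {"time", "sample_temperature", "furnace_temperature"}

def check_axis(axis: str) -> str:
    # Hypothetical helper: fail fast before handing the value to read_ngb.
    if axis not in VALID_AXES:
        raise ValueError(f"dynamic_axis must be one of {sorted(VALID_AXES)}")
    return axis

axis = check_axis("sample_temperature")
# With a real sample/baseline pair (identical temperature programs):
# data = read_ngb("sample.ngb-ss3", baseline_file="baseline.ngb-bs3",
#                 dynamic_axis=axis)
```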
Returns
pa.Table or tuple[FileMetadata, pa.Table]
    - If return_metadata=False: PyArrow table with embedded metadata
    - If return_metadata=True: (metadata, PyArrow table) tuple
    - If baseline_file is provided: baseline-subtracted data
Raises
FileNotFoundError
    If the specified file does not exist.
NGBStreamNotFoundError
    If required data streams are missing from the NGB file.
NGBCorruptedFileError
    If the file structure is invalid or corrupted.
zipfile.BadZipFile
    If the file is not a valid ZIP archive.
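A defensive loading pattern can catch these in order of likelihood. The sketch below demonstrates the shape of the check using only the stdlib exceptions (`FileNotFoundError`, `zipfile.BadZipFile`); the pyngb-specific errors (`NGBStreamNotFoundError`, `NGBCorruptedFileError`) would be caught the same way, and the `classify_file` helper is hypothetical:

```python
import zipfile
from pathlib import Path

def classify_file(path: str) -> str:
    # Hypothetical helper mirroring the checks read_ngb performs:
    # a missing file raises FileNotFoundError, a non-ZIP raises BadZipFile.
    try:
        if not Path(path).exists():
            raise FileNotFoundError(path)
        with zipfile.ZipFile(path):
            pass
    except FileNotFoundError:
        return "missing"
    except zipfile.BadZipFile:
        return "not a valid ZIP archive"
    return "ok"

print(classify_file("no_such_file.ngb-ss3"))
# missing
```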
Examples
Basic usage (recommended for most users):
```python
from pyngb import read_ngb
import polars as pl

# Load NGB file
data = read_ngb("experiment.ngb-ss3")

# Convert to DataFrame for analysis
df = pl.from_arrow(data)
print(f"Shape: {df.height} rows x {df.width} columns")
# Shape: 2500 rows x 8 columns

# Access embedded metadata
import json
metadata = json.loads(data.schema.metadata[b'file_metadata'])
print(f"Sample: {metadata['sample_name']}")
print(f"Instrument: {metadata['instrument']}")
# Sample: Polymer Sample A
# Instrument: NETZSCH STA 449 F3 Jupiter
```
Advanced usage (for metadata-heavy workflows):
```python
# Get metadata and data separately
metadata, data = read_ngb("experiment.ngb-ss3", return_metadata=True)

# Work with metadata directly
print(f"Operator: {metadata.get('operator', 'Unknown')}")
print(f"Sample mass: {metadata.get('sample_mass', 0)} mg")
print(f"Data points: {data.num_rows}")
# Operator: Jane Smith
# Sample mass: 15.2 mg
# Data points: 2500

# Use metadata for data processing
df = pl.from_arrow(data)
initial_mass = metadata['sample_mass']
df = df.with_columns(
    (pl.col('mass') / initial_mass * 100).alias('mass_percent')
)
```
Data analysis workflow:
```python
# Simple analysis
data = read_ngb("sample.ngb-ss3")
df = pl.from_arrow(data)

# Basic statistics
if "sample_temperature" in df.columns:
    temp_range = df["sample_temperature"].min(), df["sample_temperature"].max()
    print(f"Temperature range: {temp_range[0]:.1f} to {temp_range[1]:.1f} °C")
# Temperature range: 25.0 to 800.0 °C

# Mass loss calculation
if "mass" in df.columns:
    mass_loss = (df["mass"].max() - df["mass"].min()) / df["mass"].max() * 100
    print(f"Mass loss: {mass_loss:.2f}%")
# Mass loss: 12.3%
```
Performance Notes
- Fast binary parsing with NumPy optimization
- Memory-efficient processing with PyArrow
- Typical parsing time: 0.1-10 seconds depending on file size
- Includes file hash for integrity verification
See Also
NGBParser : Low-level parser for custom processing BatchProcessor : Process multiple files efficiently
Source code in src/pyngb/api/loaders.py
pyngb.NGBParser
Main parser for NETZSCH STA NGB files with enhanced error handling.
This is the primary interface for parsing NETZSCH NGB files. It orchestrates the parsing of metadata and measurement data from the various streams within an NGB file.
The parser handles the complete workflow: 1. Opens and validates the NGB ZIP archive 2. Extracts metadata from stream_1.table 3. Processes measurement data from stream_2.table and stream_3.table 4. Returns structured data with embedded metadata
Example
```python
parser = NGBParser()
metadata, data_table = parser.parse("sample.ngb-ss3")
print(f"Sample: {metadata.get('sample_name', 'Unknown')}")
print(f"Data shape: {data_table.num_rows} x {data_table.num_columns}")
# Sample: Test Sample 1
# Data shape: 2500 x 8
```
Advanced Configuration
```python
config = PatternConfig()
config.column_map["custom_id"] = "custom_column"
parser = NGBParser(config)
```
Attributes:

| Name | Description |
|---|---|
| `config` | Pattern configuration for parsing |
| `markers` | Binary markers for data identification |
| `binary_parser` | Low-level binary parsing engine |
| `metadata_extractor` | Metadata extraction engine |
| `data_processor` | Data stream processing engine |
Thread Safety
This parser is not thread-safe. Create separate instances for concurrent parsing operations.
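One way to honor this constraint in concurrent code is to keep one parser per worker thread via `threading.local`. The sketch below uses a stand-in `Parser` class so it runs anywhere; in real code you would substitute `pyngb.NGBParser` (the pattern, not the class, is the point):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Parser:
    # Stand-in for pyngb.NGBParser, which is not thread-safe.
    def parse(self, path):
        return (path, id(self))

_local = threading.local()

def _get_parser():
    # Lazily create exactly one parser instance per worker thread.
    if not hasattr(_local, "parser"):
        _local.parser = Parser()
    return _local.parser

def parse_file(path):
    return _get_parser().parse(path)

with ThreadPoolExecutor(max_workers=4) as pool:
    # map preserves input order, so results line up with the paths.
    results = list(pool.map(parse_file, ["a.ngb-ss3", "b.ngb-ss3", "c.ngb-ss3"]))
```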
Source code in src/pyngb/core/parser.py
Functions
validate_ngb_structure(zip_file)
Validate that the ZIP file has the expected NGB structure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `zip_file` | `ZipFile` | Open ZIP file to validate | required |

Returns:

| Type | Description |
|---|---|
| `list[str]` | List of available streams |

Raises:

| Type | Description |
|---|---|
| `NGBStreamNotFoundError` | If required streams are missing |
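The same structural check can be illustrated with the stdlib `zipfile` module alone. The stream names come from the parser description above, though whether a real archive nests them under a subdirectory is an assumption not made here:

```python
import io
import zipfile

# Stream names from the parsing workflow described above.
REQUIRED = {"stream_1.table", "stream_2.table", "stream_3.table"}

# Build an in-memory archive mimicking an (incomplete) NGB container.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("stream_1.table", b"\x00")
    zf.writestr("stream_2.table", b"\x00")

# Validate: list available streams and detect what is missing.
with zipfile.ZipFile(buf) as zf:
    available = set(zf.namelist())
    missing = REQUIRED - available

print(sorted(missing))
# ['stream_3.table']
```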
Source code in src/pyngb/core/parser.py
parse(path)
Parse NGB file and return metadata and Arrow table.
Opens an NGB file, extracts all metadata and measurement data, and returns them as separate objects for flexible use.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to the .ngb-ss3 file to parse | required |
Returns:

| Type | Description |
|---|---|
| `tuple[FileMetadata, Table]` | Tuple of (metadata_dict, pyarrow_table) |
Raises:

| Type | Description |
|---|---|
| `FileNotFoundError` | If the specified file doesn't exist |
| `NGBStreamNotFoundError` | If required streams are missing |
| `NGBCorruptedFileError` | If file structure is invalid |
| `BadZipFile` | If file is not a valid ZIP archive |
Example
```python
metadata, data = parser.parse("experiment.ngb-ss3")
print(f"Instrument: {metadata.get('instrument', 'Unknown')}")
print(f"Columns: {data.column_names}")
print(f"Temperature range: {data['sample_temperature'].min()} to {data['sample_temperature'].max()}")
# Instrument: NETZSCH STA 449 F3 Jupiter
# Columns: ['time', 'sample_temperature', 'mass', 'dsc_signal', 'purge_flow']
# Temperature range: 25.0 to 800.0
```