Hyperspectral Data Processing Using Python

Hyperspectral remote sensing technology provides unique advantages for fine classification and identification of ground objects by acquiring information across hundreds of continuous narrow bands. The Python ecosystem offers a rich set of libraries for processing hyperspectral data, and this article will comprehensively introduce these tools and their applications.

1. Data Reading Module

Hyperspectral data comes in various formats, requiring specialized libraries to read different types of data.

1.1 Spectral Library – ENVI Format Reading

Concept: Spectral is a Python library specifically designed for hyperspectral data processing, supporting reading and basic operations in the ENVI standard format.

Function: Provides input and output capabilities for hyperspectral data, supporting basic visualization and analysis.

import spectral as sp
import numpy as np
# Read ENVI format hyperspectral data
def read_envi_data(hdr_file):
    """
    Read ENVI format hyperspectral data
    """
    img = sp.open_image(hdr_file)
    data = img.load()
    print(f"Data shape: {data.shape}")
    print(f"Number of bands: {data.shape[2]}")
    return data
# Example usage
data = read_envi_data('hyperspectral_data.hdr')

1.2 h5py Library – HDF5 Format Reading

Concept: HDF5 is an efficient format for storing scientific data, and the h5py library provides a Python interface for reading and writing HDF5 files.

Function: Handles hyperspectral data stored in HDF5 format, supporting efficient reading and writing of large datasets.

import h5py

def read_hdf5_data(file_path, dataset_path):
    """
    Read HDF5 format hyperspectral data
    """
    with h5py.File(file_path, 'r') as f:
        data = f[dataset_path][:]
        if 'wavelengths' in f:
            wavelengths = f['wavelengths'][:]
        else:
            wavelengths = None
    print(f"HDF5 data shape: {data.shape}")
    return data, wavelengths
# Example usage
data, wavelengths = read_hdf5_data('data.h5', '/hyperspectral/data')

2. Data Preprocessing Module

Preprocessing is a critical step in hyperspectral analysis, including radiometric correction, noise removal, etc.

2.1 Radiometric Calibration and Atmospheric Correction

Concept: Converts digital quantization values to surface reflectance, eliminating atmospheric effects.

Function: Improves data quality, making data collected at different times and by different sensors comparable.

import numpy as np
from sklearn.preprocessing import StandardScaler

def radiometric_calibration(raw_data, gain, offset):
    """
    Radiometric calibration: Converts DN values to radiance
    """
    return raw_data * gain + offset

def atmospheric_correction(radiance_data, dark_object_value):
    """
    Simple atmospheric correction: Using dark object subtraction
    """
    return radiance_data - dark_object_value

def normalize_data(data):
    """
    Data normalization: Makes each band have a mean of 0 and a standard deviation of 1
    """
    original_shape = data.shape
    data_2d = data.reshape(-1, original_shape[2])
    scaler = StandardScaler()
    data_normalized = scaler.fit_transform(data_2d)
    return data_normalized.reshape(original_shape)
# Example usage
gain = 0.01  # Gain coefficient
offset = 0.5  # Offset
dark_object = 100  # Dark object value
radiance_data = radiometric_calibration(data, gain, offset)
reflectance_data = atmospheric_correction(radiance_data, dark_object)
normalized_data = normalize_data(reflectance_data)

2.2 Noise Removal and Spectral Smoothing

Concept: Uses filtering techniques to remove noise and smooth spectral curves.

Function: Improves signal-to-noise ratio and enhances spectral features.

from scipy import signal
from scipy.ndimage import uniform_filter

def savitzky_golay_smoothing(spectrum, window_size=11, order=2):
    """
    Savitzky-Golay filtering for spectral smoothing
    """
    return signal.savgol_filter(spectrum, window_size, order)

def wavelet_denoising(data, wavelet='db4', level=3):
    """
    Wavelet denoising
    """
    import pywt
    coeffs = pywt.wavedec(data, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-level])) / 0.6745
    uthresh = sigma * np.sqrt(2 * np.log(len(data)))
    coeffs = [pywt.threshold(c, uthresh, 'soft') for c in coeffs]
    return pywt.waverec(coeffs, wavelet)

def apply_spectral_smoothing(data, method='savgol', **kwargs):
    """
    Apply spectral smoothing
    """
    smoothed_data = np.zeros_like(data)
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            spectrum = data[i, j, :]
            if method == 'savgol':
                smoothed_data[i, j, :] = savitzky_golay_smoothing(spectrum, **kwargs)
            elif method == 'wavelet':
                smoothed_data[i, j, :] = wavelet_denoising(spectrum, **kwargs)
    return smoothed_data
# Example usage
smoothed_data = apply_spectral_smoothing(data, method='savgol', window_size=11, order=2)

3. Spectral Visualization Module

Visualization is an important means of understanding hyperspectral data.

3.1 Spectral Curve Plotting

Concept: Plotting the spectral reflectance curves of individual or multiple pixels.

Function: Visually displays the spectral characteristics of ground objects for identification and analysis.

import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

def plot_individual_spectrum(data, x, y, wavelengths, title=None):
    """
    Plot the spectral curve of a single pixel
    """
    spectrum = data[y, x, :]
    plt.figure(figsize=(12, 6))
    plt.plot(wavelengths, spectrum, 'b-', linewidth=1.5, label=f'Pixel ({x}, {y})')
    plt.xlabel('Wavelength (nm)', fontsize=12)
    plt.ylabel('Reflectance', fontsize=12)
    plt.title(title or f'Spectral Signature at Pixel ({x}, {y})', fontsize=14)
    plt.grid(True, alpha=0.3)
    plt.legend()
    plt.tight_layout()
    plt.show()

def plot_multiple_spectra(data, pixels, wavelengths, labels=None, colors=None):
    """
    Plot comparison of spectral curves for multiple pixels
    """
    if colors is None:
        colors = list(mcolors.TABLEAU_COLORS.values())
    if labels is None:
        labels = [f'Pixel ({x}, {y})' for x, y in pixels]
    plt.figure(figsize=(14, 8))
    for i, (x, y) in enumerate(pixels):
        spectrum = data[y, x, :]
        plt.plot(wavelengths, spectrum, color=colors[i % len(colors)],
                 linewidth=2, label=labels[i])
    plt.xlabel('Wavelength (nm)', fontsize=12)
    plt.ylabel('Reflectance', fontsize=12)
    plt.title('Comparison of Spectral Signatures', fontsize=16)
    plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
# Example usage
pixels = [(100, 50), (150, 75), (200, 100)]  # Coordinates of three pixels
labels = ['Vegetation', 'Water', 'Urban']  # Corresponding land cover types
plot_multiple_spectra(data, pixels, wavelengths, labels=labels)

3.2 Hyperspectral Image Visualization

Concept: Visualizing high-dimensional hyperspectral data as 2D images.

Function: Visually displays spatial information and feature distribution.

def plot_rgb_composite(data, r_band, g_band, b_band, stretch=True):
    """
    Generate RGB true color composite image
    """
    rgb = data[:, :, [r_band, g_band, b_band]]
    if stretch:
        # 2% linear stretch
        p_low, p_high = np.percentile(rgb, (2, 98))
        rgb = np.clip((rgb - p_low) / (p_high - p_low), 0, 1)
    else:
        rgb = (rgb - rgb.min()) / (rgb.max() - rgb.min())
    plt.figure(figsize=(10, 10))
    plt.imshow(rgb)
    plt.axis('off')
    plt.title(f'RGB Composite (Bands: {r_band}, {g_band}, {b_band})')
    plt.tight_layout()
    plt.show()

def plot_principal_components(pca_result, n_components=3):
    """
    Display principal component analysis results
    """
    fig, axes = plt.subplots(1, n_components, figsize=(15, 5))
    for i in range(n_components):
        pc = pca_result[:, :, i]
        im = axes[i].imshow(pc, cmap='viridis')
        axes[i].set_title(f'PC {i+1}')
        axes[i].axis('off')
        plt.colorbar(im, ax=axes[i], shrink=0.8)
    plt.tight_layout()
    plt.show()
# Example usage
plot_rgb_composite(data, 29, 19, 9)  # Common false color band combination

4. Feature Extraction and Dimensionality Reduction Module

Hyperspectral data has high dimensionality, requiring dimensionality reduction to improve computational efficiency and model performance.

4.1 Principal Component Analysis (PCA)

Concept: Converts correlated variables into linearly uncorrelated principal components through orthogonal transformation.

Function: Reduces dimensionality, removes redundant information, and retains main features.

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def apply_pca(data, n_components=10, explained_variance_threshold=0.95):
    """
    Apply PCA for dimensionality reduction
    """
    original_shape = data.shape
    data_2d = data.reshape(-1, original_shape[2])
    # Data normalization
    scaler = StandardScaler()
    data_scaled = scaler.fit_transform(data_2d)
    # Dynamically determine the number of principal components
    if explained_variance_threshold < 1:
        pca_temp = PCA()
        pca_temp.fit(data_scaled)
        cumulative_variance = np.cumsum(pca_temp.explained_variance_ratio_)
        n_components = np.argmax(cumulative_variance >= explained_variance_threshold) + 1
        print(f"Selected {n_components} principal components, explaining {cumulative_variance[n_components-1]:.3f} of the variance")
    # Apply PCA
    pca = PCA(n_components=n_components)
    pca_result = pca.fit_transform(data_scaled)
    # Reshape to image format
    pca_image = pca_result.reshape(original_shape[0], original_shape[1], n_components)
    print(f"Explained variance ratio for each principal component: {pca.explained_variance_ratio_}")
    print(f"Cumulative explained variance: {np.cumsum(pca.explained_variance_ratio_)}")
    return pca_image, pca, scaler
# Example usage
pca_result, pca_model, scaler = apply_pca(data, explained_variance_threshold=0.95)

4.2 Minimum Noise Fraction (MNF)

Concept: A transformation based on maximizing the signal-to-noise ratio, equivalent to two PCA transformations.

Function: Separates signal and noise, improving the accuracy of subsequent analyses.

from sklearn.decomposition import PCA
from sklearn.covariance import EmpiricalCovariance

def mnf_transform(data, n_components=10):
    """
    Apply MNF transformation
    """
    original_shape = data.shape
    data_2d = data.reshape(-1, original_shape[2])
    # Step 1: Noise covariance estimation and whitening
    noise_cov = EmpiricalCovariance().fit(data_2d).covariance_
    d, V = np.linalg.eigh(noise_cov)
    d = np.maximum(d, 1e-6)  # Avoid division by zero errors
    # Whitening transformation matrix
    whitening_matrix = V @ np.diag(1.0 / np.sqrt(d)) @ V.T
    data_whitened = data_2d @ whitening_matrix
    # Step 2: Apply PCA to whitened data
    pca = PCA(n_components=n_components)
    mnf_result = pca.fit_transform(data_whitened)
    # Reshape to image format
    mnf_image = mnf_result.reshape(original_shape[0], original_shape[1], n_components)
    return mnf_image, whitening_matrix, pca
# Example usage
mnf_result, whitening_matrix, mnf_pca = mnf_transform(data, n_components=10)

5. Image Classification and Recognition Module

Hyperspectral image classification is one of the core applications.

5.1 Support Vector Machine (SVM) Classification

Concept: Finds the optimal hyperplane to separate data points of different classes.

Function: Suitable for small sample classification of high-dimensional data, performing excellently in hyperspectral classification.

from sklearn.svm import SVC
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, classification_report
import joblib

def prepare_classification_data(data, ground_truth):
    """
    Prepare classification data
    """
    mask = ground_truth > 0
    X = data[mask]
    y = ground_truth[mask]
    return X, y

def svm_classification(data, ground_truth, test_size=0.3, optimize_params=True):
    """
    SVM hyperspectral image classification
    """
    X, y = prepare_classification_data(data, ground_truth)
    # Split into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42, stratify=y
    )
    # Parameter optimization
    if optimize_params:
        param_grid = {
            'C': [0.1, 1, 10, 100],
            'gamma': ['scale', 'auto', 0.01, 0.1],
            'kernel': ['rbf']
        }
        grid_search = GridSearchCV(
            SVC(random_state=42), param_grid, cv=5,
            scoring='accuracy', n_jobs=-1, verbose=1
        )
        grid_search.fit(X_train, y_train)
        best_params = grid_search.best_params_
        print(f"Best parameters: {best_params}")
        svm = grid_search.best_estimator_
    else:
        svm = SVC(C=1.0, kernel='rbf', gamma='scale', random_state=42)
        svm.fit(X_train, y_train)
    # Prediction and evaluation
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Classification accuracy: {accuracy:.4f}")
    print("\nDetailed classification report:")
    print(classification_report(y_test, y_pred))
    # Full image prediction
    full_prediction = svm.predict(data.reshape(-1, data.shape[2]))
    classification_map = full_prediction.reshape(data.shape[0], data.shape[1])
    return classification_map, svm, accuracy
# Example usage
classification_result, svm_model, accuracy = svm_classification(
    pca_result[:, :, :10],  # Use the first 10 principal components
    ground_truth_mask,
    optimize_params=True)

5.2 Random Forest Classification

Concept: A machine learning algorithm that integrates multiple decision trees for voting decisions.

Function: Handles high-dimensional data, resists overfitting, and provides feature importance assessment.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def random_forest_classification(data, ground_truth, n_estimators=100, max_features='auto'):
    """
    Random forest hyperspectral image classification
    """
    X, y = prepare_classification_data(data, ground_truth)
    # Create random forest classifier
    rf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_features=max_features,
        max_depth=None,
        min_samples_split=2,
        random_state=42,
        n_jobs=-1,
        verbose=1
    )
    # Cross-validation
    cv_scores = cross_val_score(rf, X, y, cv=5, n_jobs=-1)
    print(f"Cross-validation accuracy: {cv_scores.mean():.4f} (±{cv_scores.std()*2:.4f})")
    # Train final model
    rf.fit(X, y)
    # Feature importance
    feature_importance = rf.feature_importances_
    print("Band importance ranking (top 10):")
    important_bands = np.argsort(feature_importance)[-10:][::-1]
    for i, band in enumerate(important_bands):
        print(f"{i+1}. Band {band}: {feature_importance[band]:.4f}")
    # Full image prediction
    full_prediction = rf.predict(data.reshape(-1, data.shape[2]))
    classification_map = full_prediction.reshape(data.shape[0], data.shape[1])
    return classification_map, rf, cv_scores.mean()
# Example usage
rf_result, rf_model, rf_accuracy = random_forest_classification(
    data,  # Use original data
    ground_truth_mask,
    n_estimators=200)

6. Deep Learning Processing Module

Deep learning provides new solutions for hyperspectral data processing.

6.1 1D-CNN Spectral Classification

Concept: Uses one-dimensional convolutional neural networks to process spectral sequence data.

Function: Automatically extracts spectral features, end-to-end learning classifier.

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

class SpectralCNN1D(nn.Module):
    """1D CNN hyperspectral classification network"""
    def __init__(self, input_channels, num_classes):
        super(SpectralCNN1D, self).__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),
            nn.Conv1d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2),
            nn.Conv1d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1)
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, num_classes)
        )
    def forward(self, x):
        x = x.unsqueeze(1)  # Add channel dimension
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

def train_cnn_model(X, y, num_bands, num_classes, epochs=100, batch_size=32):
    """
    Train 1D-CNN model
    """
    # Data preparation
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y
    )
    # Convert to Tensor
    X_train_tensor = torch.FloatTensor(X_train)
    y_train_tensor = torch.LongTensor(y_train)
    X_test_tensor = torch.FloatTensor(X_test)
    y_test_tensor = torch.LongTensor(y_test)
    # Create data loader
    train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    # Initialize model
    model = SpectralCNN1D(num_bands, num_classes)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    # Training loop
    train_losses, test_accuracies = [], []
    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        for batch_X, batch_y in train_loader:
            optimizer.zero_grad()
            outputs = model(batch_X)
            loss = criterion(outputs, batch_y)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        scheduler.step()
        # Validation
        model.eval()
        with torch.no_grad():
            test_outputs = model(X_test_tensor)
            _, predicted = torch.max(test_outputs, 1)
            accuracy = (predicted == y_test_tensor).float().mean()
        train_losses.append(running_loss / len(train_loader))
        test_accuracies.append(accuracy.item())
        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{epochs}], Loss: {train_losses[-1]:.4f}, Accuracy: {accuracy:.4f}')
    return model, train_losses, test_accuracies
# Example usage
X, y = prepare_classification_data(data, ground_truth_mask)
cnn_model, losses, accuracies = train_cnn_model(X, y, data.shape[2], len(np.unique(y)))

7. Accuracy Assessment and Validation Module

Scientific accuracy assessment is an important part of hyperspectral analysis.

7.1 Comprehensive Classification Accuracy Assessment

Concept: Uses multiple metrics to comprehensively assess the accuracy of classification results.

Function: Objectively evaluates classification performance and guides model optimization.

from sklearn.metrics import (confusion_matrix, cohen_kappa_score,
                            f1_score, precision_score, recall_score)
import seaborn as sns

def comprehensive_evaluation(y_true, y_pred, class_names):
    """
    Comprehensive classification accuracy assessment
    """
    # Basic metrics
    overall_accuracy = accuracy_score(y_true, y_pred)
    kappa = cohen_kappa_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred, average='weighted')
    precision = precision_score(y_true, y_pred, average='weighted')
    recall = recall_score(y_true, y_pred, average='weighted')
    # Confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    # Producer accuracy and user accuracy
    producer_accuracy = np.diag(cm) / np.sum(cm, axis=1)
    user_accuracy = np.diag(cm) / np.sum(cm, axis=0)
    # Print results
    print("=" * 50)
    print("Classification Accuracy Assessment Report")
    print("=" * 50)
    print(f"Overall Accuracy (OA): {overall_accuracy:.4f}")
    print(f"Kappa Coefficient: {kappa:.4f}")
    print(f"F1 Score: {f1:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")
    print("\nPer Class Accuracy:")
    for i, class_name in enumerate(class_names):
        print(f"{class_name}: "
              f"Producer Accuracy={producer_accuracy[i]:.4f}, "
              f"User Accuracy={user_accuracy[i]:.4f}")
    # Plot confusion matrix
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                 xticklabels=class_names, yticklabels=class_names)
    plt.xlabel('Predicted Label', fontsize=12)
    plt.ylabel('True Label', fontsize=12)
    plt.title('Confusion Matrix', fontsize=14)
    plt.tight_layout()
    plt.show()
    return {
        'overall_accuracy': overall_accuracy,
        'kappa': kappa,
        'f1_score': f1,
        'precision': precision,
        'recall': recall,
        'producer_accuracy': producer_accuracy,
        'user_accuracy': user_accuracy,
        'confusion_matrix': cm
    }
# Example usage
evaluation_results = comprehensive_evaluation(
    y_test,
    y_pred,
    class_names=['Water', 'Vegetation', 'Building', 'Bare Soil', 'Road'])

7.2 Cross-Validation and Statistical Testing

Concept Function: Ensures the reliability of assessment results and scientifically compares the performance of different algorithms.: Uses cross-validation to assess model stability and statistical testing to compare model performance.

from sklearn.model_selection import cross_val_score, StratifiedKFold
from scipy import stats

def cross_validation_comparison(X, y, models, cv_folds=5):
    """
    Cross-validation comparison of multiple models
    """
    cv = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=42)
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring='accuracy', n_jobs=-1)
        results[name] = scores
        print(f"{name}: {scores.mean():.4f} (±{scores.std()*2:.4f})")
    # Statistical significance testing
    model_names = list(results.keys())
    for i in range(len(model_names)):
        for j in range(i + 1, len(model_names)):
            t_stat, p_value = stats.ttest_rel(
                results[model_names[i]],
                results[model_names[j]]
            )
            print(f"{model_names[i]} vs {model_names[j]}: p-value = {p_value:.4f}")
    return results
# Example usage
models = {
    'SVM': SVC(C=1.0, kernel='rbf', gamma='scale', random_state=42),
    'RandomForest': RandomForestClassifier(n_estimators=100, random_state=42),
    'KNN': KNeighborsClassifier(n_neighbors=5)}
cv_results = cross_validation_comparison(X, y, models)

8. Advanced Application Module

8.1 Endmember Extraction and Spectral Unmixing

Concept: Extracts pure land cover spectra (endmembers) from mixed pixels and calculates their proportions (abundance).

Function: Addresses the mixed pixel problem and improves land cover identification accuracy.

from scipy.optimize import nnls
from sklearn.decomposition import NMF

def vca_endmember_extraction(data, num_endmembers):
    """
    Vertex Component Analysis (VCA) endmember extraction
    """
    # Simplified VCA algorithm implementation
    # Data preprocessing and projection
    data_2d = data.reshape(-1, data.shape[2])
    data_2d = data_2d - np.mean(data_2d, axis=0)
    # SVD decomposition
    U, S, Vt = np.linalg.svd(data_2d.T, full_matrices=False)
    # Project onto signal subspace
    projection = U[:, :num_endmembers].T
    projected_data = projection @ data_2d.T
    # Find endmembers
    endmembers = []
    indices = []
    # Find extreme points in the projection space
    for _ in range(num_endmembers):
        # Find the point with the largest norm in the projection space
        norms = np.linalg.norm(projected_data, axis=0)
        idx = np.argmax(norms)
        indices.append(idx)
        # Extract endmember
        endmember = data_2d[idx]
        endmembers.append(endmember)
        # Orthogonalize projection
        if len(endmembers) > 1:
            u = projection @ endmember
            for j in range(len(endmembers) - 1):
                u = u - (u @ endmembers[j]) * endmembers[j]
            projected_data = projected_data - np.outer(u, u.T @ projected_data) / np.dot(u, u)
    return np.array(endmembers), indices

def fully_constrained_unmixing(data, endmembers):
    """
    Fully constrained least squares spectral unmixing
    """
    original_shape = data.shape
    data_2d = data.reshape(-1, original_shape[2])
    abundances = np.zeros((data_2d.shape[0], endmembers.shape[0]))
    for i in range(data_2d.shape[0]):
        # Non-negative least squares unmixing
        abundance, _ = nnls(endmembers.T, data_2d[i])
        # Sum-to-one constraint
        if np.sum(abundance) > 0:
            abundance = abundance / np.sum(abundance)
        abundances[i] = abundance
    # Reshape to image format
    abundance_maps = abundances.reshape(original_shape[0], original_shape[1], endmembers.shape[0])
    return abundance_maps
# Example usage
endmembers, indices = vca_endmember_extraction(data, num_endmembers=5)
abundance_maps = fully_constrained_unmixing(data, endmembers)

Conclusion

Python provides a complete toolchain for hyperspectral data processing. This article introduces the complete workflow from data reading to advanced applications:

Data Reading: Libraries like spectral and h5py support various formats
Preprocessing: Radiometric calibration, atmospheric correction, noise removal
Visualization: Spectral curves and image displays
Feature Extraction: Dimensionality reduction methods like PCA and MNF
Classification and Recognition: Traditional machine learning and deep learning methods
Accuracy Assessment: Multi-metric comprehensive evaluation system
Advanced Applications: Endmember extraction and spectral unmixing

These tools and methods provide strong technical support for hyperspectral data analysis, allowing researchers to choose appropriate methods based on specific needs. It is recommended to combine multiple methods in practical applications and conduct comparative experiments to achieve the best results.

Note: The code examples in this article require the installation of the corresponding Python libraries. It is recommended to create a dedicated environment for hyperspectral data processing using Anaconda.

Source: 无事过一夏