Handlers Module¶
gigaspatial.handlers ¶
base ¶
BaseHandler ¶
Bases: ABC
Abstract base class that orchestrates configuration, downloading, and reading functionality.
This class serves as the main entry point for dataset handlers, providing a unified interface for data acquisition and loading. It manages the lifecycle of config, downloader, and reader components.
Subclasses should implement the abstract methods to provide specific handler types and define how components are created and interact.
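The lifecycle described above can be illustrated with a small, self-contained sketch. It does not import gigaspatial and is not the library's implementation: `SketchHandler`, `DemoHandler`, and the simplified factory signatures (without `data_store` and `logger`) are hypothetical names used only to show how missing components fall back to the factory methods.

```python
from abc import ABC, abstractmethod


class SketchHandler(ABC):
    """Hypothetical stand-in for the orchestration pattern; not the gigaspatial class."""

    def __init__(self, config=None, downloader=None, reader=None):
        # Components passed in are used as-is; missing ones come from the factories.
        self.config = config if config is not None else self.create_config()
        self.downloader = (
            downloader if downloader is not None else self.create_downloader(self.config)
        )
        self.reader = reader if reader is not None else self.create_reader(self.config)

    @abstractmethod
    def create_config(self):
        """Return a configuration object."""

    @abstractmethod
    def create_downloader(self, config):
        """Return a downloader built from the configuration."""

    @abstractmethod
    def create_reader(self, config):
        """Return a reader built from the configuration."""


class DemoHandler(SketchHandler):
    # Plain dicts and tuples stand in for real config/downloader/reader objects.
    def create_config(self):
        return {"base_path": "data/demo"}

    def create_downloader(self, config):
        return ("downloader", config["base_path"])

    def create_reader(self, config):
        return ("reader", config["base_path"])


handler = DemoHandler()
custom = DemoHandler(config={"base_path": "elsewhere"})
```

Passing a component explicitly (as with `custom`) skips the corresponding factory, which is the behavior the `__init__` parameter table below describes for `None` defaults.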
Source code in gigaspatial/handlers/base.py
config: BaseHandlerConfig property ¶
Get the configuration object.
downloader: BaseHandlerDownloader property ¶
Get the downloader object.
reader: BaseHandlerReader property ¶
Get the reader object.
__enter__() ¶
__exit__(exc_type, exc_val, exc_tb) ¶
__init__(config=None, downloader=None, reader=None, data_store=None, logger=None) ¶
Initialize the BaseHandler with optional components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Optional[BaseHandlerConfig] | Configuration object. If None, it is created via create_config(). | None |
| downloader | Optional[BaseHandlerDownloader] | Downloader instance. If None, it is created via create_downloader(). | None |
| reader | Optional[BaseHandlerReader] | Reader instance. If None, it is created via create_reader(). | None |
| data_store | Optional[DataStore] | Data store instance. Defaults to LocalDataStore if not provided. | None |
| logger | Optional[Logger] | Logger instance. If not provided, one is created based on the class name. | None |
Source code in gigaspatial/handlers/base.py
__repr__() ¶
String representation of the handler.
Source code in gigaspatial/handlers/base.py
cleanup() ¶
Cleanup resources used by the handler.
Override in subclasses if specific cleanup is needed.
create_config(data_store, logger, **kwargs) abstractmethod ¶
Create and return a configuration object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
| BaseHandlerConfig | Configured BaseHandlerConfig instance |
Source code in gigaspatial/handlers/base.py
create_downloader(config, data_store, logger, **kwargs) abstractmethod ¶
Create and return a downloader object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | BaseHandlerConfig | The configuration object | required |
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
| BaseHandlerDownloader | Configured BaseHandlerDownloader instance |
Source code in gigaspatial/handlers/base.py
create_reader(config, data_store, logger, **kwargs) abstractmethod ¶
Create and return a reader object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | BaseHandlerConfig | The configuration object | required |
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
| BaseHandlerReader | Configured BaseHandlerReader instance |
Source code in gigaspatial/handlers/base.py
download_and_load(source, crop_to_source=False, force_download=False, **kwargs) ¶
Convenience method to download (if needed) and load data in one call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
| force_download | bool | If True, download even if data exists locally | False |
| **kwargs | | Additional parameters | {} |
Returns:
| Type | Description |
|---|---|
| Any | Loaded data |
Source code in gigaspatial/handlers/base.py
ensure_data_available(source, force_download=False, **kwargs) ¶
Ensure that data is available for the given source.
This method checks if the required data exists locally, and if not (or if force_download is True), downloads it using the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
| force_download | bool | If True, download even if data exists locally | False |
| **kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| bool | bool | True if data is available after this operation |
Source code in gigaspatial/handlers/base.py
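The check-then-download behavior described for ensure_data_available can be sketched without the library. This is a minimal illustration under stated assumptions: `ensure_available` and `fake_download` are hypothetical helpers, and local files in a temporary directory stand in for whatever data units a real handler manages.

```python
import tempfile
from pathlib import Path


def ensure_available(paths, download, force_download=False):
    """Return True if every path exists after the (optional) download step."""
    # Download only what is missing, unless a re-download is forced.
    pending = [p for p in paths if force_download or not Path(p).exists()]
    if pending:
        download(pending)
    return all(Path(p).exists() for p in paths)


calls = []


def fake_download(paths):
    # Hypothetical downloader: records the request and writes empty files.
    calls.append(list(paths))
    for p in paths:
        Path(p).write_text("")


with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "tile_001.tif")
    first = ensure_available([target], fake_download)    # file missing: downloads
    second = ensure_available([target], fake_download)   # file present: no download
    forced = ensure_available([target], fake_download, force_download=True)
```

In this sketch the downloader runs twice: once because the file is missing and once because `force_download=True` bypasses the local check.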
get_available_data_info(source, **kwargs) ¶
Get information about available data for the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame] | The data source specification | required |
| **kwargs | | Additional parameters | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Information about data availability, paths, etc. |
Source code in gigaspatial/handlers/base.py
load_data(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load data from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
| ensure_available | bool | If True, ensure data is downloaded before loading | True |
| **kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
| Any | Loaded data (type depends on specific handler implementation) |
Source code in gigaspatial/handlers/base.py
BaseHandlerConfig dataclass ¶
Bases: ABC
Abstract base class for handler configuration objects. Provides standard fields for path, parallelism, data store, and logger. Extend this class for dataset-specific configuration.
Source code in gigaspatial/handlers/base.py
clear_unit_cache() ¶
extract_search_geometry(source, **kwargs) ¶
General method to extract a canonical geometry from supported source types.
Notes¶
- For gpd.GeoDataFrame inputs, an optional crs keyword argument is interpreted as the target CRS. If provided and different from source.crs, the GeoDataFrame is reprojected before unioning the geometry.
- For bare geometries and point collections, no CRS transformation is performed here; subclasses can add handler-specific CRS normalization if needed (see e.g. GHSLDataConfig).
Source code in gigaspatial/handlers/base.py
get_data_unit_path(unit, **kwargs) abstractmethod ¶
get_data_unit_paths(units, **kwargs) ¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/base.py
get_relevant_data_units_by_geometry(geometry, **kwargs) abstractmethod ¶
Given a geometry, return a list of relevant data unit identifiers (e.g., tiles, files, resources).
Source code in gigaspatial/handlers/base.py
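As one illustration of what a get_relevant_data_units_by_geometry implementation might compute, the sketch below maps a bounding box to the IDs of the regular grid tiles it touches. Everything here is an assumption for illustration: the function name `tiles_for_bounds`, the 10-degree tile size, and the `tile_{row}_{col}` naming are hypothetical, and a real handler would work from an actual geometry and its dataset's own tiling scheme.

```python
import math


def tiles_for_bounds(bounds, tile_size_deg=10):
    """Return IDs of the grid tiles that intersect (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    # Floor division by the tile size gives the tile index on each axis.
    col_start = math.floor(minx / tile_size_deg)
    col_end = math.floor(maxx / tile_size_deg)
    row_start = math.floor(miny / tile_size_deg)
    row_end = math.floor(maxy / tile_size_deg)
    return [
        f"tile_{row}_{col}"
        for row in range(row_start, row_end + 1)
        for col in range(col_start, col_end + 1)
    ]


# A box spanning longitudes 28.5 to 31.0 crosses the 30-degree tile edge.
units = tiles_for_bounds((28.5, -3.0, 31.0, -1.0))
```

The returned identifiers would then be passed to something like get_data_unit_paths to obtain file paths.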
BaseHandlerDownloader ¶
Bases: ABC
Abstract base class for handler downloader classes. Standardizes config, data_store, and logger initialization. Extend this class for dataset-specific downloaders.
Source code in gigaspatial/handlers/base.py
BaseHandlerReader ¶
Bases: ABC
Abstract base class for handler reader classes. Provides common methods for resolving source paths and loading data. Supports resolving by country, points, geometry, GeoDataFrame, or explicit paths. Includes generic loader functions for raster and tabular data.
Source code in gigaspatial/handlers/base.py
load(source, crop_to_source=False, **kwargs) ¶
Load data from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | The data source (country code/name, points, geometry, paths, etc.). | required |
| crop_to_source | bool | If True, crop loaded data to the exact source geometry. | False |
| **kwargs | | Additional parameters to pass to the loading process. | {} |
Returns:
| Type | Description |
|---|---|
| Any | The loaded data. The type depends on the subclass implementation. |
Source code in gigaspatial/handlers/base.py
load_from_paths(source_data_path, **kwargs) abstractmethod ¶
Abstract method to load source data from paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_data_path | List[Union[str, Path]] | List of source paths | required |
| **kwargs | | Additional parameters for data loading | {} |
Returns:
| Type | Description |
|---|---|
| Any | Loaded data (DataFrame, GeoDataFrame, etc.) |
Source code in gigaspatial/handlers/base.py
resolve_by_paths(paths, **kwargs) ¶
Return explicit paths as a list.
Source code in gigaspatial/handlers/base.py
resolve_source_paths(source, **kwargs) ¶
Resolve source data paths based on the type of source input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | Can be a country code or name (str), list of points, geometry, GeoDataFrame, or explicit path(s) | required |
| **kwargs | | Additional parameters for path resolution | {} |
Returns:
| Type | Description |
|---|---|
| List[Union[str, Path]] | List of resolved source paths |
Source code in gigaspatial/handlers/base.py
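Resolving paths "based on the type of source input" amounts to a type dispatch. The sketch below shows that idea with only standard-library types; it is not the library's logic. The helper name `classify_source` is hypothetical, geometry and GeoDataFrame inputs are left out to avoid extra dependencies, and the rule that a string with a file suffix is a path (otherwise a country) is an assumption made for this example.

```python
from pathlib import Path


def classify_source(source):
    """Label the kind of source so a caller can pick a matching resolver."""
    if isinstance(source, Path):
        return "path"
    if isinstance(source, str):
        # Assumption: strings with a file suffix are paths, others are countries.
        return "path" if Path(source).suffix else "country"
    if isinstance(source, (list, tuple)) and source:
        if all(isinstance(item, (str, Path)) for item in source):
            return "paths"
        if all(isinstance(item, tuple) and len(item) == 2 for item in source):
            return "points"
    raise ValueError(f"Unsupported source type: {type(source).__name__}")


kinds = [
    classify_source("RWA"),
    classify_source("data/tiles/a.tif"),
    classify_source([(30.06, -1.95)]),
    classify_source([Path("a.tif"), "b.tif"]),
]
```

Each label would map to a resolver such as resolve_by_paths for explicit paths.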
boundaries ¶
AdminBoundaries ¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
create(country_code=None, admin_level=0, data_store=None, path=None, **kwargs) classmethod ¶
Factory method to create an AdminBoundaries instance using various data sources, depending on the provided parameters and global configuration.
Loading Logic
- If a data_store is provided and either a path is given or global_config.ADMIN_BOUNDARIES_DATA_DIR is set:
  - If path is not provided but country_code is, the path is constructed using global_config.get_admin_path().
  - Loads boundaries from the specified data store and path.
- If only country_code is provided (no data_store):
  - Attempts to load boundaries from GeoRepo (if available).
  - If GeoRepo is unavailable, attempts to load from GADM.
  - If GADM fails, falls back to geoBoundaries.
  - Raises an error if all sources fail.
- If neither country_code nor data_store is provided:
  - Raises a ValueError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| country_code | Optional[str] | ISO country code (2 or 3 letter) or country name. | None |
| admin_level | int | Administrative level (0=country, 1=state/province, etc.). | 0 |
| data_store | Optional[DataStore] | Optional data store instance for loading from existing data. | None |
| path | Optional[Union[str, Path]] | Optional path to data file (used with data_store). | None |
| **kwargs | | Additional arguments passed to the underlying creation methods. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| AdminBoundaries | AdminBoundaries | Configured instance. |
Raises:
| Type | Description |
|---|---|
| ValueError | If neither country_code nor (data_store, path) are provided, or if country_code lookup fails. |
| RuntimeError | If all data sources fail to load boundaries. |
Examples:
# Load from a data store (path auto-generated if not provided)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1, data_store=store)
# Load from a specific file in a data store
boundaries = AdminBoundaries.create(data_store=store, path="data.shp")
# Load from online sources (GeoRepo, GADM, geoBoundaries)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1)
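The online fallback order described under Loading Logic (GeoRepo, then GADM, then geoBoundaries) follows a general try-in-order pattern. The sketch below shows that pattern in isolation; it is not the AdminBoundaries implementation. `load_with_fallback` and the two stub loaders are hypothetical, and the stubs simulate an unavailable first source rather than calling any real service.

```python
def load_with_fallback(country_code, loaders):
    """Try each (name, loader) pair in order and return the first success."""
    if not country_code:
        raise ValueError("country_code is required when no data store is given")
    errors = []
    for name, loader in loaders:
        try:
            return name, loader(country_code)
        except Exception as exc:
            # Record the failure and move on to the next source.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All sources failed: " + "; ".join(errors))


def georepo_stub(code):
    # Simulates the first source being unavailable.
    raise ConnectionError("service unavailable")


def gadm_stub(code):
    # Simulates a successful load from the second source.
    return {"country": code, "level": 0}


used_source, data = load_with_fallback(
    "RWA", [("georepo", georepo_stub), ("gadm", gadm_stub)]
)
```

Collecting the per-source error messages before raising keeps the final RuntimeError informative when every source fails.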
Source code in gigaspatial/handlers/boundaries.py
from_data_store(data_store, path, admin_level=0, **kwargs) classmethod ¶
Load and create instance from internal data store.
Source code in gigaspatial/handlers/boundaries.py
from_gadm(country_code, admin_level=0, **kwargs) classmethod ¶
Load and create instance from GADM data.
Source code in gigaspatial/handlers/boundaries.py
from_georepo(country_code=None, admin_level=0, **kwargs) classmethod ¶
Load and create instance from GeoRepo (UNICEF) API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| country | | Country name (if using name-based lookup) | required |
| iso3 | | ISO3 code (if using code-based lookup) | required |
| admin_level | int | Administrative level (0=country, 1=state, etc.) | 0 |
| api_key | | GeoRepo API key (optional) | required |
| email | | GeoRepo user email (optional) | required |
| kwargs | | Extra arguments (ignored) | {} |
Returns:
| Type | Description |
|---|---|
| AdminBoundaries | AdminBoundaries instance |
Source code in gigaspatial/handlers/boundaries.py
from_global_country_boundaries(scale='medium') classmethod ¶
Load global country boundaries from Natural Earth Data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scale | str | One of 'large' (10m), 'medium' (50m), or 'small' (110m). | 'medium' |
Source code in gigaspatial/handlers/boundaries.py
get_schema_config() classmethod ¶
to_geodataframe() ¶
Convert the AdminBoundaries to a GeoDataFrame.
Source code in gigaspatial/handlers/boundaries.py
AdminBoundary ¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
gee ¶
GEEConfig dataclass ¶
Configuration class for Google Earth Engine Handler operations.
This config manages dataset metadata, authentication, and processing parameters for GEE operations within GigaSpatial.
Source code in gigaspatial/handlers/gee/config.py
__post_init__() ¶
Validate and normalize configuration after initialization.
Source code in gigaspatial/handlers/gee/config.py
from_dataset_id(dataset_id, registry=None, **overrides) classmethod ¶
Create config from dataset ID using registry.
Parameters¶
dataset_id : str
    Dataset identifier
registry : GEEDatasetRegistry, optional
    Custom registry. If None, uses built-in registry.
**overrides
    Override any config parameters
Source code in gigaspatial/handlers/gee/config.py
get_ee_reducer() ¶
Get the appropriate Earth Engine Reducer based on config.
Returns¶
ee.Reducer
    Earth Engine reducer object
Source code in gigaspatial/handlers/gee/config.py
to_dict() ¶
update(**kwargs) ¶
Update configuration parameters in place.
Source code in gigaspatial/handlers/gee/config.py
GEEDatasetEntry dataclass ¶
Base class for GEE dataset registry entries.
Defines the schema and validation for dataset metadata.
Source code in gigaspatial/handlers/gee/datasets/base.py
GEEDatasetRegistry ¶
Manages GEE dataset registry with built-in and custom datasets.
Automatically loads built-in datasets and allows users to add custom ones.
Source code in gigaspatial/handlers/gee/registry.py
__contains__(dataset_id) ¶
__init__(custom_registry=None) ¶
Initialize registry with built-in datasets.
Parameters¶
custom_registry : dict, optional
    User-provided custom dataset registry to merge with the built-in one
Source code in gigaspatial/handlers/gee/registry.py
__len__() ¶
add_custom_datasets(custom_datasets) ¶
Add or override datasets with custom definitions.
Parameters¶
custom_datasets : dict
    Dictionary of dataset_id -> GEEDatasetEntry or dict
Source code in gigaspatial/handlers/gee/registry.py
get(dataset_id) ¶
Get dataset entry by ID.
Parameters¶
dataset_id : str
    Dataset identifier
Returns¶
GEEDatasetEntry
    Dataset metadata entry
Source code in gigaspatial/handlers/gee/registry.py
get_datasets_by_cadence(cadence) ¶
Get datasets by temporal cadence.
get_datasets_by_source(source) ¶
Get all datasets from a specific source (e.g., 'NOAA', 'NASA').
Source code in gigaspatial/handlers/gee/registry.py
list_datasets() ¶
load_from_json(filepath) classmethod ¶
Load registry from JSON file.
save_to_json(filepath) ¶
search(keyword) ¶
Search datasets by keyword in name or description.
Parameters¶
keyword : str
    Search keyword
Returns¶
list
    Matching dataset IDs
Source code in gigaspatial/handlers/gee/registry.py
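A keyword search over name and description can be pictured as a simple filter. The sketch below is illustrative only and does not use GEEDatasetRegistry: the function `search_datasets`, the dict-based entries, the sample names and descriptions, and the choice of case-insensitive matching are all assumptions.

```python
def search_datasets(registry, keyword):
    """Return IDs whose name or description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        dataset_id
        for dataset_id, entry in registry.items()
        if needle in entry.get("name", "").lower()
        or needle in entry.get("description", "").lower()
    ]


# Hypothetical entries; a real registry stores GEEDatasetEntry objects.
demo_registry = {
    "nightlights": {
        "name": "VIIRS Nighttime Lights",
        "description": "Monthly average radiance",
    },
    "population": {
        "name": "Population Grid",
        "description": "Gridded population counts",
    },
}

matches = search_datasets(demo_registry, "radiance")
```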
GEEProfiler ¶
Google Earth Engine profiler for inspecting and mapping datasets.
Provides comprehensive functionality to:
- Inspect GEE collections (bands, dates, properties, metadata)
- Map values to point locations with optional buffers
- Map values to polygon zones with spatial aggregation
- Extract temporal profiles and time series
- Download data for offline use
Examples¶
# Initialize with dataset ID (uses built-in registry)
profiler = GEEProfiler(dataset_id="nightlights")
profiler.display_collection_info()
# Map to schools with buffers
enriched = profiler.map_to_points(
    gdf=schools,
    band="avg_rad",
    reducer="mean",
    buffer_radius_m=1000,
    start_date="2020-01-01",
    end_date="2020-12-31"
)
# Map to admin zones
zones = profiler.map_to_zones(
    gdf=admin_boundaries,
    band="avg_rad",
    reducer="sum"
)
Source code in gigaspatial/handlers/gee/profiler.py
feature_collection: Union[gpd.GeoDataFrame, ee.FeatureCollection] property writable ¶
Get the current feature collection.
image_collection: Union[ee.ImageCollection, ee.Image] property writable ¶
Get the current image collection.
__init__(dataset_id=None, collection=None, service_account=globalconfig.GOOGLE_SERVICE_ACCOUNT, key_path=globalconfig.GOOGLE_SERVICE_ACCOUNT_KEY_PATH, project_id=globalconfig.GOOGLE_CLOUD_PROJECT, data_store=None, **config_overrides) ¶
Initialize GEE profiler.
Parameters¶
dataset_id : str, optional
    Dataset ID from built-in registry (e.g., "nightlights", "population"). If provided, automatically loads collection and metadata.
collection : str, ee.ImageCollection, or ee.Image, optional
    Direct GEE collection ID or object. Overrides dataset_id if both provided.
service_account : str, optional
    Google service account email for authentication
key_path : str, optional
    Path to service account JSON key file
project_id : str, optional
    Google Cloud project ID
**config_overrides
    Additional configuration parameters to override defaults (e.g., band="avg_rad", reducer="median", scale=1000)
Examples¶
# Use built-in dataset
profiler = GEEProfiler(dataset_id="nightlights")
# Use custom collection
profiler = GEEProfiler(collection="NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG")
# With authentication
profiler = GEEProfiler(
    dataset_id="nightlights",
    service_account="my-account@project.iam.gserviceaccount.com",
    key_path="/path/to/key.json"
)
Source code in gigaspatial/handlers/gee/profiler.py
display_band_names() ¶
Print all available band names.
Examples¶
profiler.display_band_names()
Available bands (2):
1. avg_rad
2. cf_cvg
Source code in gigaspatial/handlers/gee/profiler.py
display_collection_info() ¶
Print comprehensive information about the collection.
Examples¶
profiler.display_collection_info()
GEE Collection Information
Dataset ID: nightlights
Collection: NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG
Type: ImageCollection
Number of images: 154
Bands:
Available bands (2):
1. avg_rad
2. cf_cvg
Temporal coverage:
Date range:
From: 2012-04-01
To: 2024-12-01
Configuration
Default reducer: mean
Temporal cadence: monthly
============================================================
Source code in gigaspatial/handlers/gee/profiler.py
display_date_range(date_format='%Y-%m-%d') ¶
Print the date range of the ImageCollection.
Parameters¶
date_format : str
    Date format string (default: "%Y-%m-%d")
Examples¶
profiler.display_date_range()
Date range:
From: 2012-04-01
To: 2024-12-01
Source code in gigaspatial/handlers/gee/profiler.py
display_properties(image_index=0) ¶
Print properties of an image.
Parameters¶
image_index : int
    Index of image to inspect (default: 0)
Source code in gigaspatial/handlers/gee/profiler.py
get_band_names() ¶
Get all band names from the collection or image.
Returns¶
list
    List of band names
Examples¶
profiler = GEEProfiler(dataset_id="nightlights")
bands = profiler.get_band_names()
print(bands)
['avg_rad', 'cf_cvg']
Source code in gigaspatial/handlers/gee/profiler.py
get_date_range(date_format='%Y-%m-%d') ¶
Get the date range of an ImageCollection.
Parameters¶
date_format : str
    Date format string (default: "%Y-%m-%d")
Returns¶
dict
    Dictionary with 'min' and 'max' dates
Raises¶
TypeError
    If collection is an Image (not ImageCollection)
Examples¶
date_range = profiler.get_date_range()
print(date_range)
Source code in gigaspatial/handlers/gee/profiler.py
get_properties(image_index=0) ¶
Get properties of an image from the collection.
Parameters¶
image_index : int
    Index of image to inspect (default: 0 for first image)
Returns¶
dict
    Image properties
Source code in gigaspatial/handlers/gee/profiler.py
map_to_points(gdf, band=None, reducer=None, buffer_radius_m=None, start_date=None, end_date=None, temporal_reducer=None, chunk_size=None, scale=None) ¶
Map GEE values to point locations with optional circular buffers.
This method extracts values from the Earth Engine image/collection at point locations. If buffer_radius_m is provided, circular buffers are created around points and spatial reduction is applied within each buffer.
Parameters¶
gdf : GeoDataFrame
    Point locations to enrich (must have Point geometries)
band : str, optional
    Band to extract (uses config default if None)
reducer : str, optional
    Spatial reducer: mean, median, min, max, sum, etc. (uses config default if None)
buffer_radius_m : float, optional
    Buffer radius in meters around each point. If None or 0, point values are extracted directly.
start_date : str, optional
    Start date YYYY-MM-DD for temporal filtering (uses config if None)
end_date : str, optional
    End date YYYY-MM-DD for temporal filtering (uses config if None)
temporal_reducer : str, optional
    How to aggregate over time if multiple images: mean, median, max, etc. (uses config default if None)
chunk_size : int, optional
    Features per chunk for API rate limiting (uses config default if None)
scale : float, optional
    Spatial resolution in meters (uses config/dataset default if None)
Returns¶
GeoDataFrame Enriched GeoDataFrame with new column: {band}_{reducer}
Examples¶
>>> # Extract nightlight values at school points with 1 km buffers
>>> profiler = GEEProfiler(dataset_id="nightlights")
>>> enriched = profiler.map_to_points(
...     gdf=schools,
...     band="avg_rad",
...     reducer="mean",
...     buffer_radius_m=1000,
...     start_date="2020-01-01",
...     end_date="2020-12-31"
... )
>>> print(enriched[["school_id", "avg_rad_mean"]].head())
Source code in gigaspatial/handlers/gee/profiler.py
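The chunk_size parameter and the {band}_{reducer} column name can be pictured with a small standalone sketch. Both helpers below (iter_chunks, output_column) are hypothetical and only mirror the behaviour described above; how GigaSpatial actually splits features into requests is an assumption here.

```python
def iter_chunks(features, chunk_size):
    """Yield successive slices of at most chunk_size items (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(features), chunk_size):
        yield features[start:start + chunk_size]


def output_column(band, reducer):
    """Build the documented result column name: {band}_{reducer}."""
    return f"{band}_{reducer}"


point_ids = list(range(7))  # stand-in for 7 point features
chunks = list(iter_chunks(point_ids, chunk_size=3))
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]

column = output_column("avg_rad", "mean")
print(column)  # avg_rad_mean
```

A smaller chunk_size means more, lighter requests; a larger one means fewer, heavier requests.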
map_to_zones(gdf, band=None, reducer=None, start_date=None, end_date=None, temporal_reducer=None, chunk_size=None, scale=None) ¶
Map GEE values to polygon zones with spatial aggregation.
This method aggregates values from the Earth Engine image/collection within polygon boundaries (e.g., administrative zones, grid cells).
Parameters¶
gdf : GeoDataFrame
    Polygon zones to aggregate over
band : str, optional
    Band to extract (uses config default if None)
reducer : str, optional
    Spatial reducer: mean, median, min, max, sum, etc. (uses config default if None)
start_date : str, optional
    Start date YYYY-MM-DD for temporal filtering
end_date : str, optional
    End date YYYY-MM-DD for temporal filtering
temporal_reducer : str, optional
    How to aggregate over time: mean, median, max, etc.
chunk_size : int, optional
    Features per chunk for API rate limiting
scale : float, optional
    Spatial resolution in meters
Returns¶
GeoDataFrame Enriched GeoDataFrame with new column: {band}_{reducer}
Examples¶
>>> # Aggregate population within admin boundaries
>>> profiler = GEEProfiler(dataset_id="population")
>>> zones_enriched = profiler.map_to_zones(
...     gdf=admin_boundaries,
...     band="population_density",
...     reducer="sum",
...     start_date="2020-01-01",
...     end_date="2020-12-31"
... )
>>> print(zones_enriched[["admin_name", "population_density_sum"]].head())
Source code in gigaspatial/handlers/gee/profiler.py
validate_band(band_name) ¶
validate_date_range(start_date, end_date) ¶
Check if requested dates are within collection's date range.
Parameters¶
start_date : str
    Start date in YYYY-MM-DD format
end_date : str
    End date in YYYY-MM-DD format
Returns¶
bool True if dates are valid
Source code in gigaspatial/handlers/gee/profiler.py
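The check described above amounts to comparing the requested window against the collection's first and last dates. The following is a minimal sketch under stated assumptions: dates_within_range is a hypothetical function, and returning False for an out-of-range or reversed window is an assumption (the documented method may instead warn or raise).

```python
from datetime import date


def dates_within_range(start_date, end_date, collection_min, collection_max):
    """Return True if [start_date, end_date] lies inside the collection range.

    All arguments are YYYY-MM-DD strings. Hypothetical helper for illustration.
    """
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        return False  # assumption: a reversed window counts as invalid
    lower = date.fromisoformat(collection_min)
    upper = date.fromisoformat(collection_max)
    return lower <= start and end <= upper


inside = dates_within_range("2020-01-01", "2020-12-31", "2012-04-01", "2024-12-01")
too_early = dates_within_range("2010-01-01", "2020-12-31", "2012-04-01", "2024-12-01")
print(inside, too_early)  # True False
```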
config ¶
GEEConfig dataclass ¶
Configuration class for Google Earth Engine Handler operations.
This config manages dataset metadata, authentication, and processing parameters for GEE operations within GigaSpatial.
Source code in gigaspatial/handlers/gee/config.py
__post_init__() ¶
Validate and normalize configuration after initialization.
Source code in gigaspatial/handlers/gee/config.py
from_dataset_id(dataset_id, registry=None, **overrides) classmethod ¶
Create config from dataset ID using registry.
Parameters¶
dataset_id : str
    Dataset identifier
registry : GEEDatasetRegistry, optional
    Custom registry. If None, uses built-in registry.
**overrides
    Override any config parameters
Source code in gigaspatial/handlers/gee/config.py
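The "look up defaults, then apply overrides" pattern described above can be sketched with a plain dataclass. Everything in this block (SketchConfig, SKETCH_REGISTRY, config_from_dataset_id, the scale value) is hypothetical and stands in for GEEConfig and its registry; it is not the library's implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a dataset configuration."""
    collection: str
    band: str
    reducer: str = "mean"
    scale: float = 500.0  # illustrative value only


# Hypothetical registry mapping dataset IDs to default configurations
SKETCH_REGISTRY = {
    "nightlights": SketchConfig(
        collection="NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG", band="avg_rad"
    ),
}


def config_from_dataset_id(dataset_id, registry=None, **overrides):
    """Copy the registry defaults for dataset_id and apply keyword overrides."""
    registry = SKETCH_REGISTRY if registry is None else registry
    if dataset_id not in registry:
        raise KeyError(f"Unknown dataset_id: {dataset_id!r}")
    # dataclasses.replace raises TypeError for unknown field names
    return replace(registry[dataset_id], **overrides)


cfg = config_from_dataset_id("nightlights", reducer="median", scale=1000)
print(cfg.band, cfg.reducer, cfg.scale)  # avg_rad median 1000
```

Copying the defaults rather than mutating them keeps the registry entry unchanged for later callers.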
get_ee_reducer() ¶
Get the appropriate Earth Engine Reducer based on config.
Returns¶
ee.Reducer Earth Engine reducer object
Source code in gigaspatial/handlers/gee/config.py
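Resolving a reducer name to a reducer object is essentially a dictionary dispatch. The sketch below uses Python standard-library functions as stand-ins for ee.Reducer objects so that it runs without Earth Engine credentials; the REDUCERS table and get_reducer are hypothetical, and the set of names GEEConfig accepts is an assumption.

```python
import statistics

# Stand-ins for Earth Engine reducers, keyed by the names used in this reference
REDUCERS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    "max": max,
    "sum": sum,
}


def get_reducer(name):
    """Return the callable registered under a reducer name (case-insensitive)."""
    try:
        return REDUCERS[name.lower()]
    except KeyError:
        raise ValueError(
            f"Unsupported reducer {name!r}; choose from {sorted(REDUCERS)}"
        ) from None


pixels = [1.0, 2.0, 6.0]  # illustrative pixel values inside one zone
mean_value = get_reducer("mean")(pixels)
sum_value = get_reducer("SUM")(pixels)
print(mean_value, sum_value)  # 3.0 9.0
```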
to_dict() ¶
update(**kwargs) ¶
Update configuration parameters in place.
Source code in gigaspatial/handlers/gee/config.py
get_default_registry() ¶
Get or create the default registry.
datasets ¶
GEEDatasetEntry dataclass ¶
Base class for GEE dataset registry entries.
Defines the schema and validation for dataset metadata.
Source code in gigaspatial/handlers/gee/datasets/base.py
base ¶
GEEDatasetEntry dataclass ¶
Base class for GEE dataset registry entries.
Defines the schema and validation for dataset metadata.
Source code in gigaspatial/handlers/gee/datasets/base.py
profiler ¶
GEEProfiler ¶
Google Earth Engine profiler for inspecting and mapping datasets.
Provides comprehensive functionality to:
- Inspect GEE collections (bands, dates, properties, metadata)
- Map values to point locations with optional buffers
- Map values to polygon zones with spatial aggregation
- Extract temporal profiles and time series
- Download data for offline use
Examples¶
>>> # Initialize with dataset ID (uses built-in registry)
>>> profiler = GEEProfiler(dataset_id="nightlights")
>>> profiler.display_collection_info()

>>> # Map to schools with buffers
>>> enriched = profiler.map_to_points(
...     gdf=schools,
...     band="avg_rad",
...     reducer="mean",
...     buffer_radius_m=1000,
...     start_date="2020-01-01",
...     end_date="2020-12-31"
... )

>>> # Map to admin zones
>>> zones = profiler.map_to_zones(
...     gdf=admin_boundaries,
...     band="avg_rad",
...     reducer="sum"
... )
Source code in gigaspatial/handlers/gee/profiler.py
feature_collection: Union[gpd.GeoDataFrame, ee.FeatureCollection] property writable ¶
Get the current feature collection.
image_collection: Union[ee.ImageCollection, ee.Image] property writable ¶
Get the current image collection.
__init__(dataset_id=None, collection=None, service_account=globalconfig.GOOGLE_SERVICE_ACCOUNT, key_path=globalconfig.GOOGLE_SERVICE_ACCOUNT_KEY_PATH, project_id=globalconfig.GOOGLE_CLOUD_PROJECT, data_store=None, **config_overrides) ¶
Initialize GEE profiler.
Parameters¶
dataset_id : str, optional
    Dataset ID from built-in registry (e.g., "nightlights", "population"). If provided, automatically loads collection and metadata.
collection : str, ee.ImageCollection, or ee.Image, optional
    Direct GEE collection ID or object. Overrides dataset_id if both provided.
service_account : str, optional
    Google service account email for authentication
key_path : str, optional
    Path to service account JSON key file
project_id : str, optional
    Google Cloud project ID
**config_overrides
    Additional configuration parameters to override defaults (e.g., band="avg_rad", reducer="median", scale=1000)
Examples¶
>>> # Use built-in dataset
>>> profiler = GEEProfiler(dataset_id="nightlights")

>>> # Use custom collection
>>> profiler = GEEProfiler(collection="NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG")

>>> # With authentication
>>> profiler = GEEProfiler(
...     dataset_id="nightlights",
...     service_account="my-account@project.iam.gserviceaccount.com",
...     key_path="/path/to/key.json"
... )
Source code in gigaspatial/handlers/gee/profiler.py
display_band_names() ¶
Print all available band names.
Examples¶
>>> profiler.display_band_names()
Available bands (2):
  1. avg_rad
  2. cf_cvg
Source code in gigaspatial/handlers/gee/profiler.py
display_collection_info() ¶
Print comprehensive information about the collection.
Examples¶
>>> profiler.display_collection_info()
GEE Collection Information
Dataset ID: nightlights
Collection: NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG
Type: ImageCollection
Number of images: 154
Bands:
Available bands (2):
  1. avg_rad
  2. cf_cvg
Temporal coverage:
Date range:
  From: 2012-04-01
  To: 2024-12-01
Configuration
Default reducer: mean
Temporal cadence: monthly
============================================================
Source code in gigaspatial/handlers/gee/profiler.py
display_date_range(date_format='%Y-%m-%d') ¶
Print the date range of the ImageCollection.
Parameters¶
date_format : str Date format string (default: "%Y-%m-%d")
Examples¶
>>> profiler.display_date_range()
Date range:
  From: 2012-04-01
  To: 2024-12-01
Source code in gigaspatial/handlers/gee/profiler.py
display_properties(image_index=0) ¶
Print properties of an image.
Parameters¶
image_index : int Index of image to inspect (default: 0)
Source code in gigaspatial/handlers/gee/profiler.py
get_band_names() ¶
Get all band names from the collection or image.
Returns¶
list List of band names
Examples¶
>>> profiler = GEEProfiler(dataset_id="nightlights")
>>> bands = profiler.get_band_names()
>>> print(bands)
['avg_rad', 'cf_cvg']
Source code in gigaspatial/handlers/gee/profiler.py
get_date_range(date_format='%Y-%m-%d') ¶
Get the date range of an ImageCollection.
Parameters¶
date_format : str Date format string (default: "%Y-%m-%d")
Returns¶
dict Dictionary with 'min' and 'max' dates
Raises¶
TypeError If collection is an Image (not ImageCollection)
Examples¶
>>> date_range = profiler.get_date_range()
>>> print(date_range)
Source code in gigaspatial/handlers/gee/profiler.py
get_properties(image_index=0) ¶
Get properties of an image from the collection.
Parameters¶
image_index : int Index of image to inspect (default: 0 for first image)
Returns¶
dict Image properties
Source code in gigaspatial/handlers/gee/profiler.py
map_to_points(gdf, band=None, reducer=None, buffer_radius_m=None, start_date=None, end_date=None, temporal_reducer=None, chunk_size=None, scale=None) ¶
Map GEE values to point locations with optional circular buffers.
This method extracts values from the Earth Engine image/collection at point locations. If buffer_radius_m is provided, circular buffers are created around points and spatial reduction is applied within each buffer.
Parameters¶
gdf : GeoDataFrame
    Point locations to enrich (must have Point geometries)
band : str, optional
    Band to extract (uses config default if None)
reducer : str, optional
    Spatial reducer: mean, median, min, max, sum, etc. (uses config default if None)
buffer_radius_m : float, optional
    Buffer radius in meters around each point. If None or 0, point values are extracted directly.
start_date : str, optional
    Start date YYYY-MM-DD for temporal filtering (uses config if None)
end_date : str, optional
    End date YYYY-MM-DD for temporal filtering (uses config if None)
temporal_reducer : str, optional
    How to aggregate over time if multiple images: mean, median, max, etc. (uses config default if None)
chunk_size : int, optional
    Features per chunk for API rate limiting (uses config default if None)
scale : float, optional
    Spatial resolution in meters (uses config/dataset default if None)
Returns¶
GeoDataFrame Enriched GeoDataFrame with new column: {band}_{reducer}
Examples¶
>>> # Extract nightlight values at school points with 1 km buffers
>>> profiler = GEEProfiler(dataset_id="nightlights")
>>> enriched = profiler.map_to_points(
...     gdf=schools,
...     band="avg_rad",
...     reducer="mean",
...     buffer_radius_m=1000,
...     start_date="2020-01-01",
...     end_date="2020-12-31"
... )
>>> print(enriched[["school_id", "avg_rad_mean"]].head())
Source code in gigaspatial/handlers/gee/profiler.py
map_to_zones(gdf, band=None, reducer=None, start_date=None, end_date=None, temporal_reducer=None, chunk_size=None, scale=None) ¶
Map GEE values to polygon zones with spatial aggregation.
This method aggregates values from the Earth Engine image/collection within polygon boundaries (e.g., administrative zones, grid cells).
Parameters¶
gdf : GeoDataFrame
    Polygon zones to aggregate over
band : str, optional
    Band to extract (uses config default if None)
reducer : str, optional
    Spatial reducer: mean, median, min, max, sum, etc. (uses config default if None)
start_date : str, optional
    Start date YYYY-MM-DD for temporal filtering
end_date : str, optional
    End date YYYY-MM-DD for temporal filtering
temporal_reducer : str, optional
    How to aggregate over time: mean, median, max, etc.
chunk_size : int, optional
    Features per chunk for API rate limiting
scale : float, optional
    Spatial resolution in meters
Returns¶
GeoDataFrame Enriched GeoDataFrame with new column: {band}_{reducer}
Examples¶
>>> # Aggregate population within admin boundaries
>>> profiler = GEEProfiler(dataset_id="population")
>>> zones_enriched = profiler.map_to_zones(
...     gdf=admin_boundaries,
...     band="population_density",
...     reducer="sum",
...     start_date="2020-01-01",
...     end_date="2020-12-31"
... )
>>> print(zones_enriched[["admin_name", "population_density_sum"]].head())
Source code in gigaspatial/handlers/gee/profiler.py
validate_band(band_name) ¶
validate_date_range(start_date, end_date) ¶
Check if requested dates are within collection's date range.
Parameters¶
start_date : str
    Start date in YYYY-MM-DD format
end_date : str
    End date in YYYY-MM-DD format
Returns¶
bool True if dates are valid
Source code in gigaspatial/handlers/gee/profiler.py
registry ¶
GEEDatasetRegistry ¶
Manages GEE dataset registry with built-in and custom datasets.
Automatically loads built-in datasets and allows users to add custom ones.
Source code in gigaspatial/handlers/gee/registry.py
__contains__(dataset_id) ¶
__init__(custom_registry=None) ¶
Initialize registry with built-in datasets.
Parameters¶
custom_registry : dict, optional User-provided custom dataset registry to merge with built-in
Source code in gigaspatial/handlers/gee/registry.py
__len__() ¶
add_custom_datasets(custom_datasets) ¶
Add or override datasets with custom definitions.
Parameters¶
custom_datasets : dict Dictionary of dataset_id -> GEEDatasetEntry or dict
Source code in gigaspatial/handlers/gee/registry.py
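Accepting either a ready-made entry or a plain dict, as described above, usually means normalizing each value before storing it. The sketch below shows one way to do that; SketchEntry, its fields, and the asset path "users/example/my_dataset" are hypothetical and do not reflect the real GEEDatasetEntry schema.

```python
from dataclasses import dataclass


@dataclass
class SketchEntry:
    """Hypothetical stand-in for a dataset registry entry."""
    collection: str
    cadence: str = "monthly"


def add_custom_datasets(registry, custom_datasets):
    """Add or override entries; dict values are converted to SketchEntry first."""
    for dataset_id, entry in custom_datasets.items():
        if isinstance(entry, dict):
            entry = SketchEntry(**entry)
        elif not isinstance(entry, SketchEntry):
            raise TypeError(f"Unsupported entry type for {dataset_id!r}")
        registry[dataset_id] = entry  # an existing ID is overridden
    return registry


registry = {"nightlights": SketchEntry("NOAA/VIIRS/DNB/MONTHLY_V1/VCMCFG")}
add_custom_datasets(
    registry,
    {
        # hypothetical custom asset, given as a plain dict
        "my_dataset": {"collection": "users/example/my_dataset", "cadence": "annual"},
    },
)
print(sorted(registry))  # ['my_dataset', 'nightlights']
print(registry["my_dataset"].cadence)  # annual
```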
get(dataset_id) ¶
Get dataset entry by ID.
Parameters¶
dataset_id : str Dataset identifier
Returns¶
GEEDatasetEntry Dataset metadata entry
Source code in gigaspatial/handlers/gee/registry.py
get_datasets_by_cadence(cadence) ¶
Get datasets by temporal cadence.
get_datasets_by_source(source) ¶
Get all datasets from a specific source (e.g., 'NOAA', 'NASA').
Source code in gigaspatial/handlers/gee/registry.py
list_datasets() ¶
load_from_json(filepath) classmethod ¶
Load registry from JSON file.
save_to_json(filepath) ¶
search(keyword) ¶
Search datasets by keyword in name or description.
Parameters¶
keyword : str Search keyword
Returns¶
list Matching dataset IDs
Source code in gigaspatial/handlers/gee/registry.py
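A keyword search over names and descriptions can be sketched as a simple substring filter. The function and the metadata below are illustrative only; in particular, case-insensitive matching is an assumption about how the documented search behaves.

```python
def search(datasets, keyword):
    """Return IDs whose name or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        dataset_id
        for dataset_id, meta in datasets.items()
        if needle in meta.get("name", "").lower()
        or needle in meta.get("description", "").lower()
    ]


# Illustrative metadata, not the contents of the built-in registry
datasets = {
    "nightlights": {
        "name": "VIIRS Nighttime Lights",
        "description": "Monthly average radiance composites",
    },
    "population": {
        "name": "Population",
        "description": "Gridded population estimates",
    },
}

matches = search(datasets, "Radiance")
print(matches)  # ['nightlights']
```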
ghsl ¶
CoordSystem ¶
GHSLDataConfig dataclass ¶
Bases: BaseHandlerConfig
Source code in gigaspatial/handlers/ghsl.py
__repr__() ¶
Return a string representation of the GHSL dataset configuration.
Source code in gigaspatial/handlers/ghsl.py
compute_dataset_url(tile_id=None) ¶
Compute the download URL for a GHSL dataset.
Source code in gigaspatial/handlers/ghsl.py
extract_search_geometry(source, **kwargs) ¶
Extract a canonical search geometry for GHSL and normalize it to the GHSL grid CRS.
Parameters¶
source
    Any supported source type from BaseHandlerConfig: country code (str), GeoDataFrame, shapely geometry, or iterable of points.
crs : str, optional
    CRS of the input coordinates when the source itself does not carry CRS information (e.g. bare shapely geometry or list of points). For GeoDataFrames, source.crs is preferred.
Returns¶
shapely.geometry.base.BaseGeometry Geometry normalized to the GHSL CRS (self.crs).
Source code in gigaspatial/handlers/ghsl.py
get_data_unit_path(unit=None, file_ext='.zip', **kwargs) ¶
Construct and return the path for the configured dataset or dataset tile.
Source code in gigaspatial/handlers/ghsl.py
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Notes¶
- The input geometry is expected to already be in the GHSL CRS (see extract_search_geometry), i.e. self.crs.
- Callers should normally use get_relevant_data_units, which routes through extract_search_geometry and ensures CRS normalization, instead of calling this method directly.
Source code in gigaspatial/handlers/ghsl.py
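Selecting intersecting tiles can be pictured with bounding boxes alone. The sketch below is a simplified stand-in that assumes the tiles and the query are axis-aligned boxes already expressed in the same CRS; the tile IDs and coordinates are made up, and the real method works on actual geometries rather than boxes.

```python
def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersecting_tiles(tiles, query_bbox):
    """Return the IDs of tiles whose bounding box intersects query_bbox."""
    return [tile_id for tile_id, bbox in tiles.items() if boxes_intersect(bbox, query_bbox)]


# Made-up tile grid in arbitrary units; both inputs share one CRS
tiles = {
    "R1_C1": (0, 0, 10, 10),
    "R1_C2": (10, 0, 20, 10),
    "R2_C1": (0, -10, 10, 0),
}

single = intersecting_tiles(tiles, (2, 2, 8, 8))
straddling = intersecting_tiles(tiles, (8, 2, 12, 8))
print(single)      # ['R1_C1']
print(straddling)  # ['R1_C1', 'R1_C2']
```

If the query were in a different CRS than the tile grid, the comparison would be meaningless, which is why the notes above ask for normalization first.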
validate_configuration() ¶
Validate that the configuration is valid based on dataset availability constraints.
Specific rules:¶
Source code in gigaspatial/handlers/ghsl.py
GHSLDataDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of GHSL datasets.
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
download(source, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Download GHSL data for a specified geographic region.
The region can be defined by a country code/name, a list of points, a Shapely geometry, or a GeoDataFrame. This method identifies the relevant GHSL tiles intersecting the region and downloads the specified type of data (polygons or points) for those tiles in parallel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame] | Defines the geographic area for which to download data. Can be: - A string representing a country code or name. - A list of (latitude, longitude) tuples or Shapely Point objects. - A Shapely BaseGeometry object (e.g., Polygon, MultiPolygon). - A GeoDataFrame with geometry column in EPSG:4326. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments. These will be passed down to | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the region or if all downloads fail. |
Source code in gigaspatial/handlers/ghsl.py
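The effect of file_pattern on extraction can be tried on an in-memory archive. This is a minimal sketch, assuming the pattern is applied to member names with re.match; extract_matching is a hypothetical helper and the archive contents are dummy bytes.

```python
import io
import re
import zipfile


def extract_matching(zip_bytes, file_pattern=r".*\.tif$"):
    """Return {member name: bytes} for archive members matching file_pattern."""
    pattern = re.compile(file_pattern)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if pattern.match(name)
        }


# Build a small archive in memory so the example needs no files on disk
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("tile.tif", b"raster bytes")
    archive.writestr("tile.tif.ovr", b"overview bytes")
    archive.writestr("README.pdf", b"documentation")

extracted = extract_matching(buffer.getvalue())
print(sorted(extracted))  # ['tile.tif']
```

The trailing `$` is what excludes `tile.tif.ovr`; the default shown in the signatures above, `'.*\\.tif$'`, is the same pattern written without a raw string.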
download_by_country(country_code, data_store=None, country_geom_path=None, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Download GHSL data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country_code | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments that are passed to | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/ghsl.py
download_data_unit(tile_id, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Downloads and optionally extracts files for a given tile.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tile_id | str | tile ID to process. | required |
extract | bool | If True and the downloaded file is a zip, extract its contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Type | Description |
|---|---|
Optional[Union[Path, List[Path]]] | Path to the downloaded file if extract=False, list of paths to the extracted files if extract=True, None on failure. |
Source code in gigaspatial/handlers/ghsl.py
download_data_units(tile_ids, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Downloads multiple tiles in parallel, with an option to extract them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tile_ids | List[str] | A list of tile IDs to download. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list where each element corresponds to a tile ID and contains: |
Source code in gigaspatial/handlers/ghsl.py
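Downloading several tiles in parallel while keeping one result slot per tile ID can be sketched with a thread pool. This is an illustration under stated assumptions, not the library's implementation: fetch_tile is a stand-in that performs no network access, the tile IDs and paths are made up, and the use of threads is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tile(tile_id):
    """Stand-in for a single-tile download; one ID fails to show error handling."""
    if tile_id == "bad_tile":
        raise IOError("simulated download failure")
    return f"/tmp/{tile_id}.zip"


def safe_fetch(tile_id):
    """Return the downloaded path, or None if the download raised."""
    try:
        return fetch_tile(tile_id)
    except Exception:
        return None


def download_many(tile_ids, max_workers=4):
    """Fetch tiles concurrently; results keep the order of tile_ids."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_fetch, tile_ids))


results = download_many(["R1_C1", "bad_tile", "R1_C2"])
print(results)  # ['/tmp/R1_C1.zip', None, '/tmp/R1_C2.zip']
```

Because Executor.map yields results in input order, a failed tile shows up as None in its own position instead of shifting the remaining results.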
GHSLDataHandler ¶
Bases: BaseHandler
Handler for GHSL (Global Human Settlement Layer) dataset.
This class provides a unified interface for downloading and loading GHSL data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/ghsl.py
__init__(product, year=2020, resolution=100, config=None, downloader=None, reader=None, data_store=None, logger=None, **kwargs) ¶
Initialize the GHSLDataHandler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
product | Literal['GHS_BUILT_S', 'GHS_BUILT_H_AGBH', 'GHS_BUILT_H_ANBH', 'GHS_BUILT_V', 'GHS_POP', 'GHS_SMOD'] | The GHSL product to use. Must be one of: GHS_BUILT_S (built-up surface), GHS_BUILT_H_AGBH (average gross building height), GHS_BUILT_H_ANBH (average net building height), GHS_BUILT_V (building volume), GHS_POP (population), GHS_SMOD (settlement model). | required |
year | int | The year of the data (default: 2020) | 2020 |
resolution | int | The resolution in meters (default: 100) | 100 |
config | Optional[GHSLDataConfig] | Optional configuration object | None |
downloader | Optional[GHSLDataDownloader] | Optional downloader instance | None |
reader | Optional[GHSLDataReader] | Optional reader instance | None |
data_store | Optional[DataStore] | Optional data store instance | None |
logger | Optional[Logger] | Optional logger instance | None |
**kwargs | | Additional configuration parameters | {} |
Source code in gigaspatial/handlers/ghsl.py
create_config(data_store, logger, **kwargs) ¶
Create and return a GHSLDataConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataConfig | Configured GHSLDataConfig instance |
Source code in gigaspatial/handlers/ghsl.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a GHSLDataDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataDownloader | Configured GHSLDataDownloader instance |
Source code in gigaspatial/handlers/ghsl.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a GHSLDataReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataReader | Configured GHSLDataReader instance |
Source code in gigaspatial/handlers/ghsl.py
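The create_config, create_downloader, and create_reader methods follow a factory-method pattern: the handler builds any component that was not passed in. The sketch below shows that pattern in isolation. `SketchHandler` and `DemoHandler` are hypothetical classes, the signatures are simplified (no data_store or logger), and the real BaseHandler may wire its components differently.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class SketchHandler(ABC):
    """Builds missing components through overridable factory methods."""

    def __init__(
        self,
        config: Optional[Any] = None,
        downloader: Optional[Any] = None,
        reader: Optional[Any] = None,
        **kwargs: Any,
    ) -> None:
        # Use injected components when given; otherwise build them.
        self.config = config if config is not None else self.create_config(**kwargs)
        self.downloader = (
            downloader if downloader is not None else self.create_downloader(self.config)
        )
        self.reader = reader if reader is not None else self.create_reader(self.config)

    @abstractmethod
    def create_config(self, **kwargs: Any) -> Any: ...

    @abstractmethod
    def create_downloader(self, config: Any) -> Any: ...

    @abstractmethod
    def create_reader(self, config: Any) -> Any: ...


class DemoHandler(SketchHandler):
    """Toy subclass: plain dicts and tuples stand in for real components."""

    def create_config(self, **kwargs: Any) -> dict:
        return {"year": 2020, "resolution": 100, **kwargs}

    def create_downloader(self, config: dict) -> tuple:
        return ("downloader", config["year"])

    def create_reader(self, config: dict) -> tuple:
        return ("reader", config["resolution"])


handler = DemoHandler(year=2015)          # everything built by the factories
custom = DemoHandler(reader="my-reader")  # injected reader is kept as-is
print(handler.config, custom.reader)
```

Injecting a component is mainly useful in tests, where a stub reader or downloader can replace the real one.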
load_into_dataframe(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load GHSL data into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
load_into_geodataframe(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load GHSL data into a geopandas GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
GHSLDataReader ¶
Bases: BaseHandlerReader
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
load_from_paths(source_data_path, merge_rasters=False, **kwargs) ¶
Load TifProcessors from GHSL dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | | List of file paths to load | required |
merge_rasters | | If True, all rasters will be merged into a single TifProcessor. Defaults to False. | False |
Returns:
| Type | Description |
|---|---|
Union[List[TifProcessor], TifProcessor] | List of TifProcessor objects for accessing the raster data, or a single TifProcessor if merge_rasters is True. |
Source code in gigaspatial/handlers/ghsl.py
giga ¶
GigaSchoolLocationFetcher ¶
Fetch and process school location data from the Giga School Geolocation Data API.
Source code in gigaspatial/handlers/giga.py
fetch_locations(process_geospatial=False, **kwargs) ¶
Fetch and process school locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
process_geospatial | bool | Whether to process geospatial data and return a GeoDataFrame. Defaults to False. | False |
**kwargs | | Additional parameters for customization: page_size (override default page size); sleep_time (override default sleep time between requests); max_pages (limit the number of pages to fetch). | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, GeoDataFrame] | School locations with geospatial info, as a pd.DataFrame, or as a GeoDataFrame when process_geospatial is True. |
Source code in gigaspatial/handlers/giga.py
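The page_size, sleep_time, and max_pages options describe a paginated fetch loop. The sketch below shows one way such a loop can work. It is an assumption about the general technique, not the fetcher's actual code: `fetch_all_pages`, `fake_fetch_page`, and the fake records are hypothetical, and no real API is called.

```python
import time
from typing import Callable, Dict, List, Optional


def fetch_all_pages(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 100,
    sleep_time: float = 0.0,
    max_pages: Optional[int] = None,
) -> List[Dict]:
    """Collect records page by page until a short page or max_pages is hit."""
    records: List[Dict] = []
    page = 1
    while max_pages is None or page <= max_pages:
        batch = fetch_page(page, page_size)
        records.extend(batch)
        if len(batch) < page_size:
            break  # a short (or empty) page means the last page was reached
        page += 1
        time.sleep(sleep_time)  # pause between requests
    return records


# Stand-in for an HTTP call: 25 fake schools served in slices.
FAKE_SCHOOLS = [{"school_id": i} for i in range(25)]


def fake_fetch_page(page: int, size: int) -> List[Dict]:
    start = (page - 1) * size
    return FAKE_SCHOOLS[start:start + size]


everything = fetch_all_pages(fake_fetch_page, page_size=10)
limited = fetch_all_pages(fake_fetch_page, page_size=10, max_pages=2)
print(len(everything), len(limited))
```

Setting max_pages is a cheap way to sample an endpoint before committing to a full download.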
GigaSchoolMeasurementsFetcher ¶
Fetch and process school daily realtime connectivity measurements from the Giga API. This includes download/upload speeds, latency, and connectivity performance data.
Source code in gigaspatial/handlers/giga.py
fetch_measurements(**kwargs) ¶
Fetch and process school connectivity measurements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | | Additional parameters for customization: page_size (override default page size); sleep_time (override default sleep time between requests); max_pages (limit the number of pages to fetch); giga_id_school (override default giga_id_school filter); start_date (override default start_date); end_date (override default end_date). | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: School measurements with connectivity performance data. |
Source code in gigaspatial/handlers/giga.py
get_performance_summary(df) ¶
Generate a comprehensive summary of connectivity performance metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Summary statistics about connectivity performance |
Source code in gigaspatial/handlers/giga.py
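To give a feel for what a performance summary dictionary can contain, the sketch below computes basic statistics per metric with the standard library. The field names `download_speed` and `latency`, the `summarize` helper, and the output keys are all assumptions for illustration; the method itself takes a DataFrame and may report different metrics.

```python
from statistics import mean, median
from typing import Dict, List, Optional


def summarize(
    measurements: List[Dict[str, Optional[float]]], fields: List[str]
) -> Dict[str, Dict[str, float]]:
    """Return count, mean, median, min and max for each requested field."""
    summary: Dict[str, Dict[str, float]] = {}
    for field in fields:
        # Ignore missing values rather than treating them as zero.
        values = [m[field] for m in measurements if m.get(field) is not None]
        if not values:
            continue  # skip fields with no usable values
        summary[field] = {
            "count": len(values),
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical measurement rows (speeds in Mbps, latency in ms).
rows = [
    {"download_speed": 10.0, "latency": 40.0},
    {"download_speed": 20.0, "latency": 60.0},
    {"download_speed": 30.0, "latency": None},
]
report = summarize(rows, ["download_speed", "latency"])
print(report)
```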
get_school_performance_comparison(df, top_n=10) ¶
Compare performance across schools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
top_n | int | Number of top/bottom schools to include | 10 |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | School performance comparison |
Source code in gigaspatial/handlers/giga.py
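The top_n parameter suggests a ranking of schools with the best and worst performers extracted. A minimal sketch of that idea follows. Ranking by mean download speed, the `compare_schools` helper, and the `top`/`bottom` keys are assumptions; the actual method works on a DataFrame and its output structure is not documented here.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def compare_schools(
    rows: List[Tuple[str, float]], top_n: int = 10
) -> Dict[str, List[Tuple[str, float]]]:
    """Rank schools by mean download speed and return the top and bottom N."""
    speeds = defaultdict(list)
    for school_id, speed in rows:
        speeds[school_id].append(speed)
    ranked = sorted(
        ((school, mean(values)) for school, values in speeds.items()),
        key=lambda item: item[1],
        reverse=True,  # fastest first
    )
    # Slice from an explicit start index so that top_n=0 yields an empty list.
    bottom = ranked[max(len(ranked) - top_n, 0):][::-1]  # slowest first
    return {"top": ranked[:top_n], "bottom": bottom}


# Hypothetical (school_id, download_speed) measurements.
samples = [("a", 10.0), ("a", 20.0), ("b", 5.0), ("c", 50.0)]
comparison = compare_schools(samples, top_n=2)
print(comparison)
```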
GigaSchoolProfileFetcher ¶
Fetch and process school profile data from the Giga School Profile API. This includes connectivity information and other school details.
Source code in gigaspatial/handlers/giga.py
fetch_profiles(**kwargs) ¶
Fetch and process school profiles including connectivity information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | | Additional parameters for customization: page_size (override default page size); sleep_time (override default sleep time between requests); max_pages (limit the number of pages to fetch); giga_id_school (override default giga_id_school filter). | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: School profiles with connectivity and geospatial info. |
Source code in gigaspatial/handlers/giga.py
get_connectivity_summary(df) ¶
Generate a summary of connectivity statistics from the fetched data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with school profile data | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Summary statistics about connectivity |
Source code in gigaspatial/handlers/giga.py
google_open_buildings ¶
GoogleOpenBuildingsConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for Google Open Buildings dataset files. Implements the BaseHandlerConfig interface for data unit resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_path(unit, data_type='polygons', **kwargs) ¶
Given a tile row or tile_id, return the corresponding file path.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_paths(units, data_type='polygons', **kwargs) ¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/google_open_buildings.py
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/google_open_buildings.py
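Selecting the tiles that intersect a query geometry can be approximated with bounding boxes. The sketch below shows that simplified test only. The `tiles` dictionary, the tile IDs, and both helper functions are hypothetical, and the library most likely performs a proper geometric intersection on tile polygons rather than this box comparison.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def relevant_tiles(tiles: Dict[str, BBox], query: BBox) -> List[str]:
    """Return the IDs of all tiles whose box intersects the query box."""
    return [tile_id for tile_id, bbox in tiles.items() if boxes_intersect(bbox, query)]


# Hypothetical tile index: ID -> bounding box in degrees.
tiles = {
    "t1": (0.0, 0.0, 10.0, 10.0),
    "t2": (10.0, 0.0, 20.0, 10.0),
    "t3": (30.0, 30.0, 40.0, 40.0),
}
hits = relevant_tiles(tiles, (8.0, 2.0, 12.0, 4.0))
print(hits)
```

A box test like this is often used as a fast pre-filter before an exact intersection check.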
GoogleOpenBuildingsDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of Google's Open Buildings dataset.
Source code in gigaspatial/handlers/google_open_buildings.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[GoogleOpenBuildingsConfig] | Optional configuration for file paths and download settings. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/google_open_buildings.py
download_by_country(country, data_type='polygons', data_store=None, country_geom_path=None) ¶
Download Google Open Buildings data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_type | Literal['polygons', 'points'] | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
| Type | Description |
|---|---|
List[str] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_unit(tile_info, data_type='polygons') ¶
Download data file for a single tile.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_type | | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_units(tiles, data_type='polygons') ¶
Download data files for multiple tiles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_type | | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
Source code in gigaspatial/handlers/google_open_buildings.py
GoogleOpenBuildingsHandler ¶
Bases: BaseHandler
Handler for Google Open Buildings dataset.
This class provides a unified interface for downloading and loading Google Open Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/google_open_buildings.py
create_config(data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsConfig | Configured GoogleOpenBuildingsConfig instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsDownloader | Configured GoogleOpenBuildingsDownloader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsReader | Configured GoogleOpenBuildingsReader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load point data from Google Open Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building point data |
Source code in gigaspatial/handlers/google_open_buildings.py
load_polygons(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load polygon data from Google Open Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building polygon data |
Source code in gigaspatial/handlers/google_open_buildings.py
GoogleOpenBuildingsReader ¶
Bases: BaseHandlerReader
Reader for Google Open Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
load_from_paths(source_data_path, **kwargs) ¶
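Load building data from Google Open Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | | List of file paths to load | required |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building data |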
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, crop_to_source=False, **kwargs) ¶
Convenience method to load point data.
Source code in gigaspatial/handlers/google_open_buildings.py
load_polygons(source, crop_to_source=False, **kwargs) ¶
Convenience method to load polygon data.
Source code in gigaspatial/handlers/google_open_buildings.py
hdx ¶
HDXConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for HDX data access
Source code in gigaspatial/handlers/hdx.py
output_dir_path: Path property ¶
Path to save the downloaded HDX dataset
configure_hdx() ¶
Configure HDX API if not already configured
Source code in gigaspatial/handlers/hdx.py
extract_search_geometry(source, **kwargs) ¶
Overrides the base class method, since geometry extraction does not apply here. Returns a dictionary to filter by.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | | Either a country name/code (str) or a filter dictionary | required |
**kwargs | | Additional keyword arguments passed to the specific method | {} |
Source code in gigaspatial/handlers/hdx.py
fetch_dataset() ¶
Get the HDX dataset
Source code in gigaspatial/handlers/hdx.py
get_data_unit_path(unit, **kwargs) ¶
Get the path for a data unit
get_dataset_resources(filter=None, exact_match=False) ¶
Get resources from the HDX dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filter | Optional[Dict[str, Any]] | Dictionary of key-value pairs to filter resources | None |
exact_match | bool | If True, perform exact matching. If False, use pattern matching | False |
Source code in gigaspatial/handlers/hdx.py
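The difference between exact_match=True and pattern matching can be sketched on plain dictionaries. Everything here is an assumption made for illustration: the `filter_resources` helper, the `criteria` argument (standing in for filter), the sample resources, and the choice of a case-insensitive `re.search` for the pattern mode. HDX Resource objects and the library's real matching rules may behave differently.

```python
import re
from typing import Any, Dict, List, Optional


def filter_resources(
    resources: List[Dict[str, Any]],
    criteria: Optional[Dict[str, Any]] = None,
    exact_match: bool = False,
) -> List[Dict[str, Any]]:
    """Keep resources whose fields satisfy every key-value pair in criteria."""
    if not criteria:
        return list(resources)

    def matches(resource: Dict[str, Any]) -> bool:
        for key, wanted in criteria.items():
            actual = str(resource.get(key, ""))
            if exact_match:
                if actual != str(wanted):
                    return False
            # Pattern mode (assumed): case-insensitive regex search.
            elif not re.search(str(wanted), actual, re.IGNORECASE):
                return False
        return True

    return [r for r in resources if matches(r)]


# Hypothetical resource metadata.
resources = [
    {"name": "population_2020.csv", "format": "CSV"},
    {"name": "population_2020.tif", "format": "GeoTIFF"},
    {"name": "boundaries.geojson", "format": "GeoJSON"},
]
loose = filter_resources(resources, {"format": "geo"})
strict = filter_resources(resources, {"format": "CSV"}, exact_match=True)
print([r["name"] for r in loose], [r["name"] for r in strict])
```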
list_resources() ¶
List all resources in the dataset directory using the data_store.
Source code in gigaspatial/handlers/hdx.py
search_datasets(query, rows=None, sort='relevance asc, metadata_modified desc', hdx_site='prod', user_agent='gigaspatial') staticmethod ¶
Search for datasets in HDX before initializing the class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
query | str | Search query string | required |
rows | int | Number of results per page. Defaults to all datasets (sys.maxsize). | None |
sort | str | Sort order, built from 'relevance', 'views_recent', 'views_total', 'last_modified' (default: 'relevance asc, metadata_modified desc') | 'relevance asc, metadata_modified desc' |
hdx_site | str | HDX site to use - 'prod' or 'test' (default: 'prod') | 'prod' |
user_agent | str | User agent for HDX API requests (default: 'gigaspatial') | 'gigaspatial' |
Returns:
| Type | Description |
|---|---|
List[Dict] | List of dataset dictionaries containing search results |
Example
```python
results = HDXConfig.search_datasets("population", rows=5)
for dataset in results:
    print(f"Name: {dataset['name']}, Title: {dataset['title']}")
```
Source code in gigaspatial/handlers/hdx.py
HDXDownloader ¶
Bases: BaseHandlerDownloader
Downloader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
download_data_unit(resource, **kwargs) ¶
Download a single resource
Source code in gigaspatial/handlers/hdx.py
download_data_units(resources, **kwargs) ¶
Download multiple resources sequentially
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
resources | List[Resource] | List of HDX Resource objects | required |
**kwargs | | Additional keyword arguments | {} |
Returns:
| Type | Description |
|---|---|
List[str] | List of paths to downloaded files |
Source code in gigaspatial/handlers/hdx.py
HDXHandler ¶
Bases: BaseHandler
Handler for HDX datasets
Source code in gigaspatial/handlers/hdx.py
create_config(data_store, logger, **kwargs) ¶
Create and return a HDXConfig instance
Source code in gigaspatial/handlers/hdx.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a HDXDownloader instance
Source code in gigaspatial/handlers/hdx.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a HDXReader instance
Source code in gigaspatial/handlers/hdx.py
HDXReader ¶
Bases: BaseHandlerReader
Reader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
load_from_paths(source_data_path, **kwargs) ¶
Load data from paths
Source code in gigaspatial/handlers/hdx.py
healthsites ¶
HealthSitesFetcher ¶
Fetch and process health facility location data from the Healthsites.io API.
Source code in gigaspatial/handlers/healthsites.py
fetch_facilities(**kwargs) ¶
Fetch and process health facility locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | | Additional parameters for customization: country (override country filter); extent (override extent filter); from_date (get data modified from this timestamp, datetime or string); to_date (get data modified to this timestamp, datetime or string); page_size (override default page size); sleep_time (override default sleep time between requests); max_pages (limit the number of pages to fetch); output_format (override output format, 'json' or 'geojson'); flat_properties (override flat properties setting). | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, GeoDataFrame] | Union[pd.DataFrame, gpd.GeoDataFrame]: Health facilities data. Returns GeoDataFrame for geojson format, DataFrame for json format. |
Source code in gigaspatial/handlers/healthsites.py
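The "override" wording for each keyword argument implies that call-time values replace instance defaults. A minimal sketch of that merge follows. The default values, the `build_params` helper, and the conversion of datetime objects to ISO 8601 strings are assumptions; the fetcher's real defaults and date handling are not documented here.

```python
from datetime import datetime
from typing import Any, Dict

# Hypothetical defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {
    "page_size": 100,
    "sleep_time": 0.2,
    "max_pages": None,
    "output_format": "geojson",
}
DATE_KEYS = ("from_date", "to_date")


def build_params(**kwargs: Any) -> Dict[str, Any]:
    """Merge per-call overrides into the defaults and normalise dates."""
    unknown = set(kwargs) - set(DEFAULTS) - set(DATE_KEYS)
    if unknown:
        raise TypeError(f"Unexpected parameters: {sorted(unknown)}")
    params = {**DEFAULTS, **kwargs}  # later keys win, so kwargs override
    for key in DATE_KEYS:
        value = params.get(key)
        if isinstance(value, datetime):
            # Assumed wire format: ISO 8601 string.
            params[key] = value.isoformat()
    return params


params = build_params(page_size=50, from_date=datetime(2024, 1, 1))
print(params)
```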
fetch_facility_by_id(osm_type, osm_id) ¶
Fetch a specific facility by OSM type and ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
osm_type | str | OSM type (node, way, relation) | required |
osm_id | str | OSM ID | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Facility details |
Source code in gigaspatial/handlers/healthsites.py
fetch_statistics(**kwargs) ¶
Fetch statistics for health facilities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | | Same filtering parameters as fetch_facilities | {} |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Statistics data |
Source code in gigaspatial/handlers/healthsites.py
mapbox_image ¶
MapboxImageDownloader ¶
Class to download images from Mapbox Static Images API using a specific style
Source code in gigaspatial/handlers/mapbox_image.py
__init__(access_token=config.MAPBOX_ACCESS_TOKEN, style_id=None, data_store=None) ¶
Initialize the downloader with Mapbox credentials
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| access_token | str | Mapbox access token | MAPBOX_ACCESS_TOKEN |
| style_id | Optional[str] | Mapbox style ID to use for image download | None |
| data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for the given bounding box polygons using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gdf | GeoDataFrame | GeoDataFrame containing bounding box polygons | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| max_workers | int | Maximum number of concurrent downloads | 4 |
| image_prefix | str | Prefix for output image names | 'image_' |
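The role of max_workers and image_prefix can be illustrated with a minimal sketch. It assumes a thread pool is a reasonable model for "concurrent downloads"; the source does not show how the library actually schedules requests, and `fake_download` and `download_all` are hypothetical stand-ins that touch no network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_download(name: str) -> str:
    """Hypothetical stand-in for one image request; returns the output file name."""
    return f"{name}.png"


def download_all(names: List[str], max_workers: int = 4, image_prefix: str = "image_") -> List[str]:
    """Run up to `max_workers` downloads at once and return results in input order."""
    prefixed = [f"{image_prefix}{n}" for n in names]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of the inputs.
        return list(pool.map(fake_download, prefixed))


files = download_all(["a", "b", "c"], max_workers=2)
print(files)
```

A larger max_workers only helps while the remote service and local bandwidth can keep up, so a small default is a sensible starting point.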
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for given coordinates by creating bounding boxes around points
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Union[DataFrame, List[Tuple[float, float]]] | Either a DataFrame with latitude/longitude columns or a geometry column, or a list of (lat, lon) tuples | required |
| res_meters_pixel | float | Size of the bounding box in meters (creates a square) | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| max_workers | int | Maximum number of concurrent downloads | 4 |
| image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for given mercator tiles using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| max_workers | int | Maximum number of concurrent downloads | 4 |
| image_prefix | str | Prefix for output image names | 'image_' |
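Since the tiles are identified by quadkeys, it may help to see how a quadkey maps to a tile and its geographic bounds. The sketch below follows the standard Web Mercator quadkey scheme; `quadkey_to_tile` and `tile_bounds` are illustrative helpers, not part of gigaspatial, and the MercatorTiles class may expose this differently.

```python
import math
from typing import Tuple


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Decode a quadkey into (x, y, zoom) tile coordinates."""
    x = y = 0
    zoom = len(quadkey)
    for i, digit in enumerate(quadkey):
        mask = 1 << (zoom - i - 1)
        if digit == "1":
            x |= mask
        elif digit == "2":
            y |= mask
        elif digit == "3":
            x |= mask
            y |= mask
        elif digit != "0":
            raise ValueError(f"Invalid quadkey digit: {digit!r}")
    return x, y, zoom


def tile_bounds(x: int, y: int, zoom: int) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for a Web Mercator tile."""
    n = 2 ** zoom

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return lon(x), lat(y + 1), lon(x + 1), lat(y)


print(quadkey_to_tile("213"))
print(tile_bounds(*quadkey_to_tile("0")))
```

Each additional quadkey digit halves the tile size in both directions, so the quadkey length is the zoom level.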
Source code in gigaspatial/handlers/mapbox_image.py
maxar_image ¶
MaxarConfig ¶
Bases: BaseModel
Configuration for Maxar Image Downloader using Pydantic
Source code in gigaspatial/handlers/maxar_image.py
MaxarImageDownloader ¶
Source code in gigaspatial/handlers/maxar_image.py
__init__(config=None, data_store=None, **kwargs) ¶
Initialize the downloader with Maxar config.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Optional[Union[MaxarConfig, dict]] | MaxarConfig instance, dict with config values, or None | None |
| data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
| **kwargs | | Individual config parameters (overrides config dict/object) | {} |
Examples:
# Using config object
downloader = MaxarImageDownloader(config=MaxarConfig(api_key="..."))
# Using dict
downloader = MaxarImageDownloader(config={"api_key": "...", "transparent": False})
# Using kwargs
downloader = MaxarImageDownloader(api_key="...", image_format="image/jpeg")
# Mixing dict and kwargs (kwargs take precedence)
downloader = MaxarImageDownloader(config={"api_key": "..."}, transparent=False)
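The precedence rule (kwargs override values from the config dict) can be sketched with plain dictionaries. This is only an illustration of the documented behavior; `merge_config` is a hypothetical helper, and the real class presumably validates the merged values through MaxarConfig.

```python
from typing import Any, Dict, Optional


def merge_config(config: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Dict[str, Any]:
    """Merge a base config dict with keyword overrides; keyword values win."""
    merged = dict(config or {})  # copy so the caller's dict is not mutated
    merged.update(kwargs)        # keyword arguments take precedence
    return merged


base = {"api_key": "...", "transparent": True}
effective = merge_config(base, transparent=False)
print(effective)
```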
Source code in gigaspatial/handlers/maxar_image.py
build_date_filter(start_date=None, end_date=None, date_field='acquisitionDate') ¶
Build a CQL date filter string
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start_date | Optional[Union[str, datetime, date]] | Start date (inclusive). Accepts a string such as "2020-01-01" or "2020-01-01T10:30:00", or a datetime/date object. | None |
| end_date | Optional[Union[str, datetime, date]] | End date (inclusive). Same formats as start_date. | None |
| date_field | str | Field name to filter on. Other options include "createdDate" and "latestAcquisitionTime". | 'acquisitionDate' |
Returns:
| Type | Description |
|---|---|
| str | CQL filter string for date range |
Examples:
# Fixed date range (January 2024)
date_filter = downloader.build_date_filter(start_date="2024-01-01", end_date="2024-01-31")
# Only start date (everything on or after)
date_filter = downloader.build_date_filter(start_date="2024-01-01")
# Only end date (everything on or before)
date_filter = downloader.build_date_filter(end_date="2024-01-31")
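To make the shape of the resulting filter concrete, here is a minimal sketch of how an inclusive date-range CQL string could be assembled. The clause format mirrors the CQL examples shown elsewhere on this page; `sketch_date_filter` is hypothetical, and the exact string produced by build_date_filter may differ.

```python
from datetime import date, datetime
from typing import List, Optional, Union

DateLike = Union[str, date, datetime]


def _to_iso(value: DateLike) -> str:
    """Render date/datetime objects as ISO 8601 text; pass strings through."""
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    return value


def sketch_date_filter(
    start_date: Optional[DateLike] = None,
    end_date: Optional[DateLike] = None,
    date_field: str = "acquisitionDate",
) -> str:
    """Build an inclusive date-range filter; an empty string means no constraint."""
    clauses: List[str] = []
    if start_date is not None:
        clauses.append(f"({date_field}>='{_to_iso(start_date)}')")
    if end_date is not None:
        clauses.append(f"({date_field}<='{_to_iso(end_date)}')")
    return " AND ".join(clauses)


print(sketch_date_filter("2024-01-01", "2024-01-31"))
```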
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), image_prefix='maxar_image_', save_metadata=False, start_date=None, end_date=None) ¶
Download images for given bounding box polygons
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gdf | GeoDataFrame | GeoDataFrame containing bounding box polygons | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| image_prefix | str | Prefix for output image names | 'maxar_image_' |
| save_metadata | bool | If True, saves WFS metadata as JSON alongside each image | False |
| start_date | Optional[Union[str, datetime, date]] | Filter images acquired on or after this date (YYYY-MM-DD) | None |
| end_date | Optional[Union[str, datetime, date]] | Filter images acquired on or before this date (YYYY-MM-DD) | None |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), image_prefix='maxar_image_', save_metadata=False, start_date=None, end_date=None) ¶
Download images for given coordinates by creating bounding boxes around points
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_meters_pixel | float | Resolution in meters per pixel | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| image_prefix | str | Prefix for output image names | 'maxar_image_' |
| save_metadata | bool | If True, saves WFS metadata as JSON alongside each image | False |
| start_date | Optional[Union[str, datetime, date]] | Filter images acquired on or after this date (YYYY-MM-DD) | None |
| end_date | Optional[Union[str, datetime, date]] | Filter images acquired on or before this date (YYYY-MM-DD) | None |
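How a resolution and an image size translate into a bounding box around a point can be sketched as follows. This assumes the ground extent equals res_meters_pixel times the pixel count and uses a rough spherical approximation for degrees per meter; `bbox_around_point` is a hypothetical helper, and the library may compute the box differently (for example in a projected CRS).

```python
import math
from typing import Tuple

METERS_PER_DEGREE_LAT = 111_320.0  # rough spherical approximation, an assumption


def bbox_around_point(
    lat: float,
    lon: float,
    res_meters_pixel: float,
    image_size: Tuple[int, int] = (512, 512),
) -> Tuple[float, float, float, float]:
    """Return (minx, miny, maxx, maxy) in degrees, centered on (lat, lon)."""
    width_px, height_px = image_size
    half_width_m = res_meters_pixel * width_px / 2.0
    half_height_m = res_meters_pixel * height_px / 2.0
    dlat = half_height_m / METERS_PER_DEGREE_LAT
    # Longitude degrees shrink with the cosine of the latitude.
    dlon = half_width_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lon - dlon, lat - dlat, lon + dlon, lat + dlat


print(bbox_around_point(lat=14.7, lon=-17.4, res_meters_pixel=0.6))
```

With the default 512 x 512 image, a resolution of 0.6 m per pixel covers roughly 307 m on each side under these assumptions.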
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), image_prefix='maxar_image_', save_metadata=False, start_date=None, end_date=None) ¶
Download images for given mercator tiles
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
| output_dir | Union[str, Path] | Directory to save images | required |
| image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
| image_prefix | str | Prefix for output image names | 'maxar_image_' |
| save_metadata | bool | If True, saves WFS metadata as JSON alongside each image | False |
| start_date | Optional[Union[str, datetime, date]] | Filter images acquired on or after this date (YYYY-MM-DD) | None |
| end_date | Optional[Union[str, datetime, date]] | Filter images acquired on or before this date (YYYY-MM-DD) | None |
Examples:
# Download only recent imagery
downloader.download_images_by_tiles(mercator_tiles=tiles, output_dir="output/", start_date="2024-01-01")
# Download imagery from a specific date range
downloader.download_images_by_tiles(mercator_tiles=tiles, output_dir="output/", start_date="2023-06-01", end_date="2023-12-31")
Source code in gigaspatial/handlers/maxar_image.py
get_imagery_metadata(bbox=None, cql_filter=None, count=100, output_format='application/json', sort_by=None) ¶
Get metadata for imagery features using WFS GetFeature
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bbox | Optional[Tuple[float, float, float, float]] | Bounding box (minx, miny, maxx, maxy) in EPSG:4326. Cannot be used together with the cql_filter parameter. | None |
| cql_filter | Optional[str] | CQL filter string for querying by attributes. If you need both a bbox and a filter, include the bbox in the CQL: "source='WV02' AND BBOX(featureGeometry,x1,y1,x2,y2)" | None |
| count | int | Number of features to return (1-1000) | 100 |
| output_format | str | Response format | 'application/json' |
| sort_by | Optional[str] | Sort order, e.g., "acquisitionDate+A" for ascending | None |
Returns:
| Type | Description |
|---|---|
| GeoDataFrame | GeoDataFrame with feature metadata |
Example
# Query by bounding box
metadata = downloader.get_imagery_metadata(bbox=(-105.0, 39.7, -104.9, 39.8))
# Query by CQL filter with bbox
metadata = downloader.get_imagery_metadata(
    cql_filter="source='WV02' AND cloudCover<0.20 AND "
    "BBOX(featureGeometry,-105.0,39.7,-104.9,39.8)"
)
# Query by date and sensor
metadata = downloader.get_imagery_metadata(
    cql_filter="(acquisitionDate>='2024-01-01') AND (source='WV03')"
)
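Because bbox and cql_filter cannot be passed together, a caller who needs both has to fold the bounding box into the CQL string. A small sketch of that step is below; `bbox_clause` and `with_bbox` are hypothetical helpers, and the geometry field name featureGeometry is taken from the example in the parameter table.

```python
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]


def bbox_clause(bbox: BBox, geometry_field: str = "featureGeometry") -> str:
    """Render (minx, miny, maxx, maxy) as a CQL BBOX predicate."""
    minx, miny, maxx, maxy = bbox
    return f"BBOX({geometry_field},{minx},{miny},{maxx},{maxy})"


def with_bbox(cql_filter: Optional[str], bbox: BBox) -> str:
    """Append a BBOX predicate to an attribute filter, or return it alone."""
    clause = bbox_clause(bbox)
    return f"{cql_filter} AND {clause}" if cql_filter else clause


combined = with_bbox("source='WV02' AND cloudCover<0.20", (-105.0, 39.7, -104.9, 39.8))
print(combined)
```

The combined string can then be passed as cql_filter while leaving bbox unset.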
Source code in gigaspatial/handlers/maxar_image.py
get_metadata_for_bbox(bbox, **kwargs) ¶
Convenience method to get metadata summary for a bounding box
Returns a dict with summary statistics and the full GeoDataFrame
Source code in gigaspatial/handlers/maxar_image.py
microsoft_global_buildings ¶
MSBuildingsConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for Microsoft Global Buildings dataset files.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__post_init__() ¶
Initialize the configuration, load tile URLs, and set up location mapping.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_location_mapping(similarity_score_threshold=0.8) ¶
Create a mapping between the dataset's location names and ISO 3166-1 alpha-3 country codes.
This function iterates through known countries and attempts to find matching locations in the dataset based on string similarity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| similarity_score_threshold | float | The minimum similarity score (between 0 and 1) for a dataset location to be considered a match for a country. | 0.8 |
Returns:
| Type | Description |
|---|---|
| | A dictionary where keys are dataset location names and values are the corresponding ISO 3166-1 alpha-3 country codes. |
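The thresholded string matching described above can be sketched with the standard library. The source does not say which similarity measure the handler uses, so `difflib.SequenceMatcher` is an assumption here, and `sketch_location_mapping` and the sample names are illustrative only.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def sketch_location_mapping(
    dataset_locations: List[str],
    countries: Dict[str, str],
    similarity_score_threshold: float = 0.8,
) -> Dict[str, str]:
    """Map dataset location names to ISO3 codes when names are similar enough."""
    mapping: Dict[str, str] = {}
    for country_name, iso3 in countries.items():
        for location in dataset_locations:
            score = SequenceMatcher(None, country_name.lower(), location.lower()).ratio()
            if score >= similarity_score_threshold:
                mapping[location] = iso3
    return mapping


locations = ["Kenya", "UnitedStates", "Senegal"]
known = {"Kenya": "KEN", "United States": "USA"}
print(sketch_location_mapping(locations, known))
```

Lowering the threshold catches more spelling variants but raises the risk of matching the wrong country, which is why a fairly strict default is reasonable.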
Source code in gigaspatial/handlers/microsoft_global_buildings.py
extract_search_geometry(source, **kwargs) ¶
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Get the DataFrame of Microsoft Buildings tiles that intersect a given source geometry.
If a country is given, this method first tries to find tiles directly mapped to that country. If no directly mapped tiles are found and the country is not in the location mapping, it falls back to finding overlapping tiles by creating Mercator tiles for the country and filtering the dataset's tiles.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of Microsoft's Global ML Building Footprints dataset.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Optional[MSBuildingsConfig] | Optional configuration for customizing download behavior and file paths. If None, a default | None |
| data_store | Optional[DataStore] | Optional instance of a | None |
| logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_by_country(country, data_store=None, country_geom_path=None) ¶
Download Microsoft Global ML Building Footprints data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
| data_store | Optional[DataStore] | Optional instance of a | None |
| country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
| Type | Description |
|---|---|
| List[str] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the country or if all downloads fail. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_unit(tile_info, **kwargs) ¶
Download data file for a single tile.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_units(tiles, **kwargs) ¶
Download data files for multiple tiles.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsHandler ¶
Bases: BaseHandler
Handler for Microsoft Global Buildings dataset.
This class provides a unified interface for downloading and loading Microsoft Global Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_config(data_store, logger, **kwargs) ¶
Create and return a MSBuildingsConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
| MSBuildingsConfig | Configured MSBuildingsConfig instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a MSBuildingsDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | MSBuildingsConfig | The configuration object | required |
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
| MSBuildingsDownloader | Configured MSBuildingsDownloader instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a MSBuildingsReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | MSBuildingsConfig | The configuration object | required |
| data_store | DataStore | The data store instance to use | required |
| logger | Logger | The logger instance to use | required |
| **kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
| MSBuildingsReader | Configured MSBuildingsReader instance |
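The create_config, create_downloader, and create_reader methods above follow a factory-method pattern: an abstract base wires the three components together and each handler supplies the concrete ones. The toy sketch below shows that shape only. `ToyHandler`, `ToyBuildingsHandler`, and the dictionary "components" are hypothetical, and whether the real base class builds its components eagerly or lazily is not shown in this section.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ToyHandler(ABC):
    """Toy base class that wires config, downloader, and reader together."""

    def __init__(self, data_store: Any = None, logger: Any = None, **kwargs: Any) -> None:
        self.config = self.create_config(data_store, logger, **kwargs)
        self.downloader = self.create_downloader(self.config, data_store, logger)
        self.reader = self.create_reader(self.config, data_store, logger)

    @abstractmethod
    def create_config(self, data_store: Any, logger: Any, **kwargs: Any) -> Dict[str, Any]: ...

    @abstractmethod
    def create_downloader(self, config: Dict[str, Any], data_store: Any, logger: Any) -> Dict[str, Any]: ...

    @abstractmethod
    def create_reader(self, config: Dict[str, Any], data_store: Any, logger: Any) -> Dict[str, Any]: ...


class ToyBuildingsHandler(ToyHandler):
    """Concrete handler: each factory returns a stand-in component."""

    def create_config(self, data_store, logger, **kwargs):
        return {"dataset": "buildings", **kwargs}

    def create_downloader(self, config, data_store, logger):
        return {"role": "downloader", "config": config}

    def create_reader(self, config, data_store, logger):
        return {"role": "reader", "config": config}


handler = ToyBuildingsHandler(n_workers=2)
print(handler.config)
```

Sharing one config object between the downloader and the reader keeps file locations consistent across both steps.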
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsReader ¶
Bases: BaseHandlerReader
Reader for Microsoft Global Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
load_from_paths(source_data_path, **kwargs) ¶
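Load building data from Microsoft Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_data_path | | List of file paths to load | required |
Returns:
| Type | Description |
|---|---|
| | GeoDataFrame containing building data |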
Source code in gigaspatial/handlers/microsoft_global_buildings.py
ookla_speedtest ¶
OoklaSpeedtestConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration class for Ookla Speedtest data.
This class defines the parameters for accessing and filtering Ookla Speedtest datasets, including available years, quarters, and how dataset URLs are constructed.
Source code in gigaspatial/handlers/ookla_speedtest.py
get_data_unit_path(unit, **kwargs) ¶
Given an Ookla Speedtest file URL, return the corresponding path.
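One plausible way to derive a local path from a file URL is to keep the file name and place it under a base directory. The sketch below shows only that idea; `sketch_data_unit_path`, the `ookla` base directory, and the example URL are all assumptions, and the actual directory layout used by OoklaSpeedtestConfig is not described here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def sketch_data_unit_path(url: str, base_path: str = "ookla") -> PurePosixPath:
    """Place the file name taken from `url` under a (hypothetical) base directory."""
    filename = PurePosixPath(urlparse(url).path).name
    return PurePosixPath(base_path) / filename


# Placeholder URL, not a real dataset location.
print(sketch_data_unit_path("https://example.com/data/2023/q1/tiles.parquet"))
```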
OoklaSpeedtestDownloader ¶
Bases: BaseHandlerDownloader
A class to handle the downloading of Ookla Speedtest data.
This downloader focuses on fetching parquet files based on the provided configuration and data unit URLs.
Source code in gigaspatial/handlers/ookla_speedtest.py
OoklaSpeedtestHandler ¶
Bases: BaseHandler
Handler for Ookla Speedtest data.
This class orchestrates the configuration, downloading, and reading of Ookla Speedtest data, allowing for filtering by geographical sources using Mercator tiles.
Source code in gigaspatial/handlers/ookla_speedtest.py
OoklaSpeedtestReader ¶
Bases: BaseHandlerReader
A class to handle reading Ookla Speedtest data.
It loads parquet files into a DataFrame.
Source code in gigaspatial/handlers/ookla_speedtest.py
opencellid ¶
OpenCellIDConfig ¶
Bases: BaseModel
Configuration for OpenCellID data access
Source code in gigaspatial/handlers/opencellid.py
output_file_path: Path property ¶
Path to save the downloaded OpenCellID data
OpenCellIDDownloader ¶
Downloader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
download_and_process() ¶
Download and process OpenCellID data for the configured country
Source code in gigaspatial/handlers/opencellid.py
from_country(country, api_token=global_config.OPENCELLID_ACCESS_TOKEN, **kwargs) classmethod ¶
Create a downloader for a specific country
Source code in gigaspatial/handlers/opencellid.py
get_download_links() ¶
Get download links for the country from OpenCellID website
Source code in gigaspatial/handlers/opencellid.py
OpenCellIDReader ¶
Reader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
read_data() ¶
Read OpenCellID data for the specified country
Source code in gigaspatial/handlers/opencellid.py
to_geodataframe() ¶
Convert OpenCellID data to a GeoDataFrame
osm ¶
OSMLocationFetcher ¶
A class to fetch and process location data from OpenStreetMap using the Overpass API.
This class supports fetching various OSM location types including amenities, buildings, shops, and other POI categories.
Source code in gigaspatial/handlers/osm.py
__post_init__() ¶
Validate inputs, normalize location_types, and set up logging.
Source code in gigaspatial/handlers/osm.py
fetch_locations(since_date=None, handle_duplicates='separate', include_metadata=False) ¶
Fetch OSM locations, optionally filtered by 'since' date.
Use this for incremental updates or getting all current locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| since_date | | Filter for locations added or modified since this date. | None |
| handle_duplicates | str | How to handle objects matching multiple categories: 'separate' creates separate entries for each category (default); 'combine' uses a single entry with a list of matching categories; 'primary' keeps only the first matching category. | 'separate' |
| include_metadata | bool | If True, include change tracking metadata (timestamp, version, changeset, user, uid). | False |
Returns:
| Type | Description |
|---|---|
| DataFrame | Processed OSM locations |
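The three handle_duplicates modes are easiest to compare on a toy input. In the sketch below, `resolve_duplicates` and the `id`/`category` field names are hypothetical; it only mirrors the documented semantics and does not reproduce the fetcher's actual output columns.

```python
from typing import Any, Dict, List


def resolve_duplicates(
    matches: Dict[int, List[str]], handle_duplicates: str = "separate"
) -> List[Dict[str, Any]]:
    """Turn {object id: [matching categories]} into rows according to the mode."""
    rows: List[Dict[str, Any]] = []
    for osm_id, categories in matches.items():
        if handle_duplicates == "separate":
            # One row per matching category.
            rows.extend({"id": osm_id, "category": c} for c in categories)
        elif handle_duplicates == "combine":
            # One row holding every matching category.
            rows.append({"id": osm_id, "category": list(categories)})
        elif handle_duplicates == "primary":
            # One row with only the first matching category.
            rows.append({"id": osm_id, "category": categories[0]})
        else:
            raise ValueError(f"Unknown mode: {handle_duplicates!r}")
    return rows


sample = {1: ["school", "library"]}
for mode in ("separate", "combine", "primary"):
    print(mode, resolve_duplicates(sample, mode))
```

'separate' suits per-category counts, while 'combine' and 'primary' keep one row per object.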
Source code in gigaspatial/handlers/osm.py
fetch_locations_changed_between(start_date, end_date, handle_duplicates='separate', include_metadata=True) ¶
Fetch OSM locations that changed within a specific date range.
Use this for historical analysis or tracking changes in a specific period.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start_date | Union[str, datetime] | Start date/time as an ISO 8601 string ("YYYY-MM-DDThh:mm:ssZ") or a datetime object. Changes after this date will be included. | required |
| end_date | Union[str, datetime] | End date/time as an ISO 8601 string ("YYYY-MM-DDThh:mm:ssZ") or a datetime object. Changes before this date will be included. | required |
| handle_duplicates | Literal['separate', 'combine', 'primary'] | How to handle objects matching multiple categories: 'separate' creates separate entries for each category (default); 'combine' uses a single entry with a list of matching categories; 'primary' keeps only the first matching category. | 'separate' |
| include_metadata | bool | If True, include change tracking metadata (timestamp, version, changeset, user, uid). Defaults to True since change tracking is the main use case. | True |
Returns:
| Type | Description |
|---|---|
| DataFrame | Processed OSM locations that changed within the date range |
Raises:
| Type | Description |
|---|---|
| ValueError | If dates are invalid or start_date is after end_date |
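The documented ValueError conditions can be pictured with a small validation sketch. It assumes the string format "YYYY-MM-DDThh:mm:ssZ" given in the parameter table and treats naive datetime objects as UTC; `parse_iso` and `validate_range` are hypothetical helpers, not the fetcher's own code.

```python
from datetime import datetime, timezone
from typing import Tuple, Union

DateInput = Union[str, datetime]


def parse_iso(value: DateInput) -> datetime:
    """Parse 'YYYY-MM-DDThh:mm:ssZ' strings; treat naive datetimes as UTC."""
    if isinstance(value, datetime):
        return value if value.tzinfo else value.replace(tzinfo=timezone.utc)
    # strptime raises ValueError for strings that do not match the format.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def validate_range(start_date: DateInput, end_date: DateInput) -> Tuple[datetime, datetime]:
    """Return parsed (start, end) or raise ValueError for an inverted range."""
    start, end = parse_iso(start_date), parse_iso(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end


print(validate_range("2024-01-01T00:00:00Z", "2024-02-01T00:00:00Z"))
```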
Source code in gigaspatial/handlers/osm.py
get_admin_names(admin_level, country=None, timeout=120) staticmethod ¶
Fetch all admin area names for a given admin_level (optionally within a country).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| admin_level | int | The OSM admin_level to search for (e.g., 4 for states, 6 for counties). | required |
| country | str | Country name or ISO code to filter within. | None |
| timeout | int | Timeout for the Overpass API request. | 120 |
Returns:
| Type | Description |
|---|---|
| List[str] | List of admin area names. |
Source code in gigaspatial/handlers/osm.py
get_osm_countries(iso3_code=None, include_names=True, timeout=1000) staticmethod ¶
Fetch countries from OpenStreetMap database.
This queries the actual OSM database for country boundaries and returns country names as they appear in OSM, including various name translations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iso3_code | str | ISO 3166-1 alpha-3 code to fetch a specific country. If provided, returns single country data. If None, returns all countries. | None |
| include_names | bool | If True, return dict with multiple name variants. If False, return only the primary name. | True |
| timeout | int | Timeout for the Overpass API request. | 1000 |
Returns:
| Type | Description |
|---|---|
| Union[str, Dict[str, str], List[str], List[Dict[str, str]]] | When iso3_code is provided: a single country name (str) if include_names=False, or a dict with name variants if include_names=True. When iso3_code is None: a list of country names if include_names=False, or a list of dicts with name variants (name, name:en, ISO3166-1 codes, and other name translations) if include_names=True. |
Raises:
| Type | Description |
|---|---|
ValueError | If iso3_code is provided but country not found in OSM. |
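The four return shapes can be easier to follow as code. The sketch below mirrors the documented contract on a plain list of OSM tag dictionaries; `shape_country_result` is a hypothetical helper written for this example, and matching on the `ISO3166-1:alpha3` tag is an assumption about how a country would be identified.

```python
def shape_country_result(tag_sets, iso3_code=None, include_names=True):
    """Mirror the documented return shapes for a list of OSM tag dicts."""
    if iso3_code is not None:
        # Assumption: countries are matched on the OSM "ISO3166-1:alpha3" tag.
        tag_sets = [t for t in tag_sets if t.get("ISO3166-1:alpha3") == iso3_code]
        if not tag_sets:
            raise ValueError(f"Country with ISO3 code {iso3_code!r} not found")
    # Either the full dict of name variants or only the primary name.
    shaped = [dict(t) if include_names else t.get("name") for t in tag_sets]
    # A single country is returned bare, all countries as a list.
    return shaped[0] if iso3_code is not None else shaped


sample = [
    {"name": "Senegal", "name:en": "Senegal", "ISO3166-1:alpha3": "SEN"},
    {"name": "Kenya", "name:en": "Kenya", "ISO3166-1:alpha3": "KEN"},
]
print(shape_country_result(sample, "KEN", include_names=False))
```

Because the return type depends on both arguments, callers that pass them dynamically may want to check the result with isinstance before using it.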
Source code in gigaspatial/handlers/osm.py
overture ¶
OvertureAmenityFetcher ¶
Fetch and process amenity locations from the Overture Places theme.
This handler queries the Overture Places GeoParquet on S3, filters by a country boundary, and returns a GeoDataFrame of point locations for the requested amenity categories.
Amenity categories¶
The amenity_types parameter should contain values from categories.primary in the Overture Places schema (for example "hospital", "clinic", "school", "restaurant").
Overture maintains the authoritative category list here: https://github.com/OvertureMaps/schema/blob/main/docs/schema/concepts/by-theme/places/overture_categories.csv
Each entry in that CSV corresponds to a valid value you can pass in amenity_types.
Examples¶
Fetch hospitals in Senegal:
```python
from gigaspatial.handlers.overture import OvertureAmenityFetcher

fetcher = OvertureAmenityFetcher(
    country="SEN",
    amenity_types=["hospital"],
)
hospitals = fetcher.fetch_locations()
```
Fetch multiple health‑related categories:
```python
fetcher = OvertureAmenityFetcher(
    country="SEN",
    amenity_types=["hospital", "clinic", "pharmacy"],
)
facilities = fetcher.fetch_locations()
```
Source code in gigaspatial/handlers/overture.py
__post_init__() ¶
Validate inputs and set up logging.
Source code in gigaspatial/handlers/overture.py
fetch_locations(match_pattern=False, **kwargs) ¶
Fetch and process amenity locations.
Source code in gigaspatial/handlers/overture.py
rwi ¶
RWIConfig dataclass ¶
Bases: HDXConfig
Configuration for Relative Wealth Index data access
Source code in gigaspatial/handlers/rwi.py
RWIDownloader ¶
Bases: HDXDownloader
Specialized downloader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
RWIHandler ¶
Bases: HDXHandler
Handler for Relative Wealth Index dataset
Source code in gigaspatial/handlers/rwi.py
create_config(data_store, logger, **kwargs) ¶
Create and return a RWIConfig instance
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a RWIDownloader instance
Source code in gigaspatial/handlers/rwi.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a RWIReader instance
Source code in gigaspatial/handlers/rwi.py
RWIReader ¶
Bases: HDXReader
Specialized reader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
srtm ¶
nasa_srtm ¶
NasaSRTMConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for NASA SRTM .hgt tiles (30m or 90m). Creates tile geometries dynamically for 1°x1° grid cells.
Each tile file covers 1 degree latitude x 1 degree longitude.
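To make the 1°x1° grid concrete, the sketch below derives a tile identifier from a coordinate, following the usual SRTM convention of naming a tile after its south-west corner (the same pattern as the 'S03E028' file name quoted elsewhere in this reference). `srtm_tile_id` is a hypothetical helper, not a function of this module.

```python
import math


def srtm_tile_id(latitude, longitude):
    """Return the ID of the 1-degree tile containing a point, e.g. 'S03E028'."""
    # A tile is named after its south-west corner, so floor both coordinates.
    lat0 = math.floor(latitude)
    lon0 = math.floor(longitude)
    ns = "N" if lat0 >= 0 else "S"
    ew = "E" if lon0 >= 0 else "W"
    return f"{ns}{abs(lat0):02d}{ew}{abs(lon0):03d}"


print(srtm_tile_id(-2.5, 28.3))
```

Flooring rather than truncating matters for negative coordinates: a latitude of -2.5 belongs to the tile whose southern edge is at -3, not -2.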
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
get_data_unit_path(unit, **kwargs) ¶
Given a tile unit or tile_id, return expected storage path.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
get_data_unit_paths(units, **kwargs) ¶
Given tile identifiers, return list of file paths.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[NasaSRTMConfig] | Optional configuration for customizing download behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
download_data_unit(tile_info, **kwargs) ¶
Download data file for a single SRTM tile.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
download_data_units(tiles, **kwargs) ¶
Download data files for multiple SRTM tiles.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMHandler ¶
Bases: BaseHandler
Main handler class for NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
create_config(data_store, logger, **kwargs) ¶
Create and return a NasaSRTMConfig instance.
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a NasaSRTMDownloader instance.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a NasaSRTMReader instance.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMReader ¶
Bases: BaseHandlerReader
A class to handle reading of NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[NasaSRTMConfig] | Optional configuration for customizing reading behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
load_from_paths(source_data_path, **kwargs) ¶
Load SRTM elevation data from file paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | List[Union[str, Path]] | List of SRTM .hgt.zip file paths | required |
**kwargs | | Additional parameters for data loading: as_dataframe (bool, default=True), if True return a concatenated DataFrame, if False return a list of SRTMParser objects; dropna (bool, default=True), if True drop rows with NaN elevation values. | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, List[SRTMParser]] | Loaded elevation data |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
srtm_manager ¶
SRTMManager ¶
Manager for accessing elevation data across multiple SRTM .hgt.zip files.
Implements lazy loading with LRU caching for efficient memory usage. Automatically handles multiple tiles for elevation profiles.
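A minimal sketch of lazy loading with LRU eviction is shown below, assuming a caller-supplied loader function. `TileCache` and its method names are invented for this illustration and are not the manager's internals, which are not shown in this reference.

```python
from collections import OrderedDict


class TileCache:
    """Load tiles on first use and keep only the most recently used ones."""

    def __init__(self, loader, cache_size=10):
        self._loader = loader          # called as loader(tile_id) on a cache miss
        self._cache_size = cache_size
        self._tiles = OrderedDict()    # insertion order doubles as recency order

    def get(self, tile_id):
        if tile_id in self._tiles:
            # Cache hit: mark the tile as most recently used.
            self._tiles.move_to_end(tile_id)
            return self._tiles[tile_id]
        # Cache miss: load lazily, then evict the least recently used tile.
        tile = self._loader(tile_id)
        self._tiles[tile_id] = tile
        if len(self._tiles) > self._cache_size:
            self._tiles.popitem(last=False)
        return tile

    def cached_ids(self):
        return list(self._tiles)


cache = TileCache(loader=lambda tile_id: f"parsed {tile_id}", cache_size=2)
cache.get("S03E028")
cache.get("S03E029")
cache.get("S03E028")
cache.get("S04E028")
print(cache.cached_ids())
```

With a cache size of 2, the third distinct tile evicts whichever of the first two was used least recently, which keeps memory bounded while a profile walks across neighbouring tiles.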
Source code in gigaspatial/handlers/srtm/srtm_manager.py
__init__(srtm_directory, downloader=None, cache_size=10, data_store=None) ¶
Initialize the SRTM Manager.
Parameters¶
srtm_directory : str or Path
    Directory containing .hgt.zip files
downloader : optional
    Downloader instance for auto-downloading missing tiles
cache_size : int, default=10
    Maximum number of SRTM tiles to keep in memory (LRU cache)
data_store : DataStore, optional
    Data store for reading files. Priority: provided data_store > downloader.data_store > LocalDataStore()
Source code in gigaspatial/handlers/srtm/srtm_manager.py
check_coverage(latitude, longitude) ¶
Check if a specific coordinate has SRTM coverage.
Parameters¶
latitude : float
    Latitude in decimal degrees
longitude : float
    Longitude in decimal degrees
Returns¶
bool
    True if tile is available, False otherwise
Source code in gigaspatial/handlers/srtm/srtm_manager.py
clear_cache() ¶
get_available_tiles() ¶
get_cache_info() ¶
get_elevation(latitude, longitude) ¶
Get interpolated elevation for a specific coordinate.
Automatically finds and loads the correct SRTM tile.
Parameters¶
latitude : float
    Latitude in decimal degrees (-90 to 90)
longitude : float
    Longitude in decimal degrees (-180 to 180)
Returns¶
float
    Interpolated elevation in meters
Raises¶
FileNotFoundError
    If the required SRTM tile is not available
Source code in gigaspatial/handlers/srtm/srtm_manager.py
get_elevation_batch(coordinates) ¶
Get elevations for multiple coordinates efficiently.
Groups coordinates by tile to minimize parser loads.
Parameters¶
coordinates : np.ndarray of shape (n, 2)
    Array of (latitude, longitude) pairs
Returns¶
np.ndarray of shape (n,)
    Elevations in meters
Raises¶
FileNotFoundError
    If any required SRTM tile is not available
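The idea of grouping coordinates by tile, as described for get_elevation_batch, can be sketched in plain Python as follows. The function `group_by_tile` is hypothetical; it keys each group by the south-west corner of the 1-degree tile and stores input indices so results can be written back in the original order.

```python
import math
from collections import defaultdict


def group_by_tile(coordinates):
    """Map each tile's south-west corner to the indices of the points inside it."""
    groups = defaultdict(list)
    for index, (lat, lon) in enumerate(coordinates):
        groups[(math.floor(lat), math.floor(lon))].append(index)
    return dict(groups)


points = [(1.2, 30.5), (-2.5, 28.3), (1.9, 30.1)]
for tile, indices in group_by_tile(points).items():
    # Each tile would be opened once and queried for all of its points.
    print(tile, indices)
```

Processing one group at a time means each tile is loaded once per batch rather than once per point.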
Source code in gigaspatial/handlers/srtm/srtm_manager.py
get_elevation_profile(start_lat, start_lon, end_lat, end_lon, num_points=100) ¶
Get elevation profile between two points.
Uses linear interpolation between points and automatically handles multiple SRTM tiles. For more accurate great circle paths over long distances, consider using geopy.
Parameters¶
start_lat : float
    Starting latitude in decimal degrees
start_lon : float
    Starting longitude in decimal degrees
end_lat : float
    Ending latitude in decimal degrees
end_lon : float
    Ending longitude in decimal degrees
num_points : int, default=100
    Number of sample points along the path
Returns¶
pd.DataFrame
    DataFrame with columns: distance_km, latitude, longitude, elevation
Raises¶
FileNotFoundError
    If any required SRTM tile along the path is not available
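The sampling step of an elevation profile can be illustrated without any SRTM data. The sketch below interpolates latitude and longitude linearly between the endpoints and assigns each sample a cumulative distance. Using the haversine formula with a mean Earth radius of 6371 km for distance_km is an assumption of this example, as are the helper names; the library's own distance calculation is not documented here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def sample_path(start_lat, start_lon, end_lat, end_lon, num_points=100):
    """Return evenly spaced sample points between two coordinates."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    total_km = haversine_km(start_lat, start_lon, end_lat, end_lon)
    rows = []
    for i in range(num_points):
        t = i / (num_points - 1)  # fraction of the way along the path
        rows.append({
            "distance_km": t * total_km,
            "latitude": start_lat + t * (end_lat - start_lat),
            "longitude": start_lon + t * (end_lon - start_lon),
        })
    return rows


for row in sample_path(0.0, 0.0, 0.0, 1.0, num_points=3):
    print(row)
```

An elevation lookup for each sampled point would then fill in the elevation column. Straight-line interpolation in latitude and longitude drifts away from the great-circle route as paths get longer, which is the limitation the description above mentions.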
Source code in gigaspatial/handlers/srtm/srtm_manager.py
srtm_parser ¶
SRTMParser ¶
Efficient parser for NASA SRTM .hgt.zip files.
Supports both SRTM-1 (3601x3601, 1 arc-second) and SRTM-3 (1201x1201, 3 arc-second) formats. Uses memory mapping for efficient handling of large files.
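The two grid sizes can be told apart from the size of the raw .hgt payload, since each sample is a big-endian signed 16-bit integer. The sketch below shows that check and the sample decoding using only the standard library; the function names are illustrative and this is not the parser's actual code.

```python
import struct

SRTM1_SIZE = 3601      # 1 arc-second grid
SRTM3_SIZE = 1201      # 3 arc-second grid
BYTES_PER_SAMPLE = 2   # big-endian signed 16-bit integers


def grid_size_from_bytes(num_bytes):
    """Infer the grid edge length from the size of the raw .hgt payload."""
    for size in (SRTM1_SIZE, SRTM3_SIZE):
        if num_bytes == size * size * BYTES_PER_SAMPLE:
            return size
    raise ValueError(f"Unexpected .hgt payload size: {num_bytes} bytes")


def decode_samples(raw):
    """Decode raw .hgt bytes into a tuple of elevation samples."""
    count = len(raw) // BYTES_PER_SAMPLE
    return struct.unpack(f">{count}h", raw[: count * BYTES_PER_SAMPLE])


print(grid_size_from_bytes(3601 * 3601 * 2))
print(decode_samples(b"\x00\x64\xff\xfe"))
```

The ">" in the format string selects big-endian byte order; reading the samples as little-endian would produce plausible-looking but wrong elevations.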
Source code in gigaspatial/handlers/srtm/srtm_parser.py
__init__(hgt_zip_path, data_store=None) ¶
Initialize the SRTM parser.
Parameters¶
hgt_zip_path : str or Path
    Path to the .hgt.zip file (e.g., 'S03E028.SRTMGL1.hgt.zip')
data_store : DataStore, optional
    Data store for reading files. If None, uses LocalDataStore()
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_elevation(latitude, longitude) ¶
Get interpolated elevation for a specific coordinate.
Uses bilinear interpolation for accurate elevation values between grid points.
Parameters¶
latitude : float
    Latitude in decimal degrees
longitude : float
    Longitude in decimal degrees
Returns¶
float
    Interpolated elevation in meters, or np.nan if outside tile bounds
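Bilinear interpolation blends the four grid samples surrounding a point, weighted by how close the point is to each. The sketch below works on a plain nested list with fractional row and column indices; `bilinear` is a hypothetical helper that assumes the indices are non-negative and inside the grid, and it is not the parser's implementation.

```python
def bilinear(grid, row, col):
    """Interpolate a value at fractional (row, col) from its four neighbours."""
    r0, c0 = int(row), int(col)
    # Clamp the far neighbours so points on the last row or column still work.
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr, fc = row - r0, col - c0
    # Blend along the columns first, then along the rows.
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


elevations = [[10.0, 20.0], [30.0, 40.0]]
print(bilinear(elevations, 0.5, 0.5))
```

At exact grid indices the result equals the stored sample, so interpolation only changes values between grid points.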
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_elevation_batch(coordinates) ¶
Get interpolated elevations for multiple coordinates (vectorized).
Parameters¶
coordinates : np.ndarray of shape (n, 2)
    Array of (latitude, longitude) pairs
Returns¶
np.ndarray of shape (n,)
    Interpolated elevations in meters
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_tile_info() ¶
Get information about the SRTM tile.
Returns¶
dict
    Dictionary containing tile metadata
Source code in gigaspatial/handlers/srtm/srtm_parser.py
to_array() ¶
Return elevation data in square array form with coordinate arrays.
Returns¶
tuple of (elevation_array, latitudes, longitudes)
    elevation_array : np.ndarray of shape (size, size)
        2D array of elevation values in meters
    latitudes : np.ndarray of shape (size,)
        Latitude values for each row (north to south)
    longitudes : np.ndarray of shape (size,)
        Longitude values for each column (west to east)
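The row and column ordering can be made concrete with a small sketch that builds the coordinate axes for a tile from its south-west corner. It assumes the grid spans exactly one degree with samples on both edges; `tile_axes` is a hypothetical helper that uses plain lists rather than NumPy arrays.

```python
def tile_axes(south, west, size):
    """Return (latitudes, longitudes) for a 1-degree tile with `size` samples per side."""
    step = 1.0 / (size - 1)
    # Rows run north to south, so latitude starts at the northern edge and decreases.
    latitudes = [south + 1.0 - i * step for i in range(size)]
    # Columns run west to east, so longitude starts at the western edge and increases.
    longitudes = [west + j * step for j in range(size)]
    return latitudes, longitudes


lats, lons = tile_axes(south=-3, west=28, size=3)
print(lats)
print(lons)
```

The first row of the elevation array therefore corresponds to the northern edge of the tile, which is easy to get backwards when plotting.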
Source code in gigaspatial/handlers/srtm/srtm_parser.py
to_dataframe(dropna=True) ¶
Convert elevation data to a DataFrame with coordinates.
Returns¶
pd.DataFrame
    DataFrame with columns: latitude, longitude, elevation
Source code in gigaspatial/handlers/srtm/srtm_parser.py
utils ¶
EarthdataSession ¶
Bases: Session
Custom requests.Session for NASA Earthdata authentication.
Maintains Authorization headers through redirects to/from Earthdata hosts. This is required because Earthdata uses multiple redirect domains during authentication.
Source code in gigaspatial/handlers/srtm/utils.py
rebuild_auth(prepared_request, response) ¶
Keep auth header on redirects to/from Earthdata host.
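The decision behind such an override can be sketched as a small predicate: keep the Authorization header when the redirect stays on the same host, or when either end of the redirect is the Earthdata login host. The host name `urs.earthdata.nasa.gov` and the function name are assumptions for this illustration; the class's actual logic lives in the source file referenced above.

```python
from urllib.parse import urlparse

# Assumption: the Earthdata Login host used for authentication redirects.
AUTH_HOST = "urs.earthdata.nasa.gov"


def keep_auth_on_redirect(original_url, redirect_url, auth_host=AUTH_HOST):
    """Return True if the Authorization header should survive this redirect."""
    original = urlparse(original_url).hostname
    redirect = urlparse(redirect_url).hostname
    if original == redirect:
        return True
    # Cross-host redirects keep credentials only when one side is the auth host.
    return auth_host in (original, redirect)


print(keep_auth_on_redirect(
    "https://data.example.org/tile.hgt.zip",
    "https://urs.earthdata.nasa.gov/oauth/authorize",
))
```

Dropping the header for any other cross-host redirect avoids leaking credentials to unrelated servers.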
Source code in gigaspatial/handlers/srtm/utils.py
unicef_georepo ¶
GeoRepoClient ¶
A client for interacting with the GeoRepo API.
GeoRepo is a platform for managing and accessing geospatial administrative boundary data. This client provides methods to search, retrieve, and work with modules, datasets, views, and administrative entities.
Attributes:
| Name | Type | Description |
|---|---|---|
base_url | str | The base URL for the GeoRepo API |
api_key | str | The API key for authentication |
email | str | The email address associated with the API key |
headers | dict | HTTP headers used for API requests |
Source code in gigaspatial/handlers/unicef_georepo.py
__init__(api_key=None, email=None) ¶
Initialize the GeoRepo client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
api_key | str | GeoRepo API key. If not provided, will use the GEOREPO_API_KEY environment variable from config. | None |
email | str | Email address associated with the API key. If not provided, will use the GEOREPO_USER_EMAIL environment variable from config. | None |
Raises:
| Type | Description |
|---|---|
ValueError | If api_key or email is not provided and cannot be found in environment variables. |
Source code in gigaspatial/handlers/unicef_georepo.py
check_connection() ¶
Checks if the API connection is valid by making a simple request.
Returns:
| Type | Description |
|---|---|
bool | True if the connection is valid, False otherwise. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_country_by_iso3(view_uuid, iso3_code) ¶
Find a country entity using its ISO3 country code.
This method searches through all level-0 (country) entities to find one that matches the provided ISO3 code. It checks both the entity's Ucode and any external codes stored in the ext_codes field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
iso3_code | str | The ISO3 country code to search for (e.g., 'USA', 'KEN', 'BRA'). | required |
Returns:
| Type | Description |
|---|---|
dict or None | Entity information dictionary for the matching country if found, including properties like name, ucode, admin_level, etc. Returns None if no matching country is found. |
Note
This method handles pagination automatically to search through all available countries in the dataset, which may involve multiple API calls.
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or view_uuid is invalid. |
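The paginated search can be pictured as a loop that requests one page at a time and stops at the first match. In the sketch below, `fetch_page` stands in for an API call returning a list of entities plus pagination metadata with a total_page field, and `matches_iso3` assumes entities expose `ucode` and `ext_codes` keys. Both helpers and the entity layout are hypothetical.

```python
def matches_iso3(entity, iso3_code):
    """Check an entity's ucode and external codes for an ISO3 code (assumed layout)."""
    if str(entity.get("ucode", "")).startswith(iso3_code):
        return True
    return iso3_code in (entity.get("ext_codes") or {}).values()


def find_first(fetch_page, predicate, page_size=50):
    """Walk pages until an item satisfies predicate; return None if none does."""
    page = 1
    while True:
        results, meta = fetch_page(page, page_size)
        for item in results:
            if predicate(item):
                return item
        if page >= meta.get("total_page", 1):
            return None
        page += 1


# A fake two-page API used only to exercise the loop.
PAGES = {
    1: [{"ucode": "KEN_V1", "ext_codes": {}}],
    2: [{"ucode": "XYZ_V1", "ext_codes": {"iso3": "BRA"}}],
}


def fake_fetch(page, page_size):
    return PAGES[page], {"page": page, "total_page": len(PAGES)}


print(find_first(fake_fetch, lambda e: matches_iso3(e, "BRA")))
```

Returning as soon as a match is found avoids requesting the remaining pages.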
Source code in gigaspatial/handlers/unicef_georepo.py
get_admin_boundaries(view_uuid, admin_level=None, geom='full_geom', format='geojson') ¶
Get administrative boundaries for a specific level or all levels.
This is a convenience method that can retrieve boundaries for a single administrative level or attempt to fetch all available levels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | Administrative level to retrieve (0=country, 1=region, etc.). If None, attempts to fetch all levels. | None |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing administrative boundaries in the specified format. For GeoJSON, returns a FeatureCollection. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_dataset_details(dataset_uuid) ¶
Get detailed information about a specific dataset.
This includes metadata about the dataset and information about available administrative levels (e.g., country, province, district).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing dataset details including: - Basic metadata (name, description, etc.) - Available administrative levels and their properties - Temporal information and data sources |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_entity_by_ucode(ucode, geom='full_geom', format='geojson') ¶
Get detailed information about a specific entity using its Ucode.
A Ucode (Universal Code) is a unique identifier for geographic entities within the GeoRepo system, typically in the format "ISO3_LEVEL_NAME".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ucode | str | The unique code identifier for the entity. | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing entity details including geometry, properties, administrative level, and metadata. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or ucode is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_vector_tiles_url(view_info) ¶
Generate an authenticated URL for accessing vector tiles.
Vector tiles are used for efficient map rendering and can be consumed by mapping libraries like Mapbox GL JS or OpenLayers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_info | dict | Dictionary containing view information that must include a 'vector_tiles' key with the base vector tiles URL. | required |
Returns:
| Type | Description |
|---|---|
str | Fully authenticated vector tiles URL with API key and user email parameters appended for access control. |
Raises:
| Type | Description |
|---|---|
ValueError | If 'vector_tiles' key is not found in view_info. |
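Appending credentials to a tile URL template can be sketched with the standard library alone. The query parameter names `token` and `georepo_user_key` below are placeholders invented for this example, as is the function name; the names the GeoRepo API actually expects are not stated in this reference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def authenticated_tiles_url(view_info, api_key, email):
    """Append credentials to the view's vector tiles URL (illustrative only)."""
    if "vector_tiles" not in view_info:
        raise ValueError("view_info has no 'vector_tiles' key")
    parts = urlsplit(view_info["vector_tiles"])
    # Preserve any query parameters already present on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Hypothetical parameter names; check the GeoRepo API documentation.
    query += [("token", api_key), ("georepo_user_key", email)]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(authenticated_tiles_url(
    {"vector_tiles": "https://example.org/tiles/{z}/{x}/{y}?t=1"},
    api_key="KEY",
    email="user@example.org",
))
```

Only the query string is re-encoded, so the {z}/{x}/{y} placeholders in the path are left intact for the mapping library to fill in.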
Source code in gigaspatial/handlers/unicef_georepo.py
list_datasets_by_module(module_uuid) ¶
List all datasets within a specific module.
A dataset represents a collection of related geographic entities, such as administrative boundaries for a specific country or region.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
module_uuid | str | The UUID of the module to query. | required |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing a list of datasets with their metadata. Each dataset includes 'uuid', 'name', 'description', creation date, etc. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or module_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entities_by_admin_level(view_uuid, admin_level, geom='no_geom', format='json', page=1, page_size=50) ¶
List entities at a specific administrative level within a view.
Administrative levels typically follow a hierarchy:
- Level 0: Countries
- Level 1: States/Provinces/Regions
- Level 2: Districts/Counties
- Level 3: Sub-districts/Municipalities
- And so on...
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | The administrative level to retrieve (0, 1, 2, etc.). | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
tuple | A tuple containing: - dict: JSON/GeoJSON response with entity data - dict: Metadata with pagination info (page, total_page, total_count) |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entity_children(view_uuid, entity_ucode, geom='no_geom', format='json') ¶
List direct children of an entity in the administrative hierarchy.
For example, if given a country entity, this will return its states/provinces. If given a state entity, this will return its districts/counties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view containing the entity. | required |
entity_ucode | str | The Ucode of the parent entity. | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing list of child entities with their properties and optional geometry data. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_modules() ¶
List all available modules in GeoRepo.
A module is a top-level organizational unit that contains datasets. Examples include "Admin Boundaries", "Health Facilities", etc.
Returns:
| Type | Description |
|---|---|
dict | JSON response containing a list of modules with their metadata. Each module includes 'uuid', 'name', 'description', and other properties. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_views_by_dataset(dataset_uuid, page=1, page_size=50) ¶
List views for a dataset with pagination support.
A view represents a specific version or subset of a dataset. Views may be tagged as 'latest' or represent different time periods.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing paginated list of views with metadata. Includes 'results', 'total_page', 'current_page', and 'count' fields. Each view includes 'uuid', 'name', 'tags', and other properties. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
search_entities_by_name(view_uuid, name, page=1, page_size=50) ¶
Search for entities by name using fuzzy matching.
This performs a similarity-based search to find entities whose names match or are similar to the provided search term.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
name | str | The name or partial name to search for. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing paginated search results with matching entities and their similarity scores. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_admin_boundaries_module() ¶
Find and return the UUID of the Admin Boundaries module.
This is a convenience function that searches through all available modules to locate the one named "Admin Boundaries", which typically contains administrative boundary datasets.
Returns:
| Type | Description |
|---|---|
str | The UUID of the Admin Boundaries module. |
Raises:
| Type | Description |
|---|---|
ValueError | If the Admin Boundaries module is not found. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_country_boundaries_by_iso3(iso3_code, client=None, admin_level=None) ¶
Get administrative boundaries for a specific country using its ISO3 code.
This function provides a high-level interface to retrieve country boundaries by automatically finding the appropriate module, dataset, and view, then fetching the requested administrative boundaries.
The function will:
1. Find the Admin Boundaries module
2. Locate a global dataset within that module
3. Find the latest view of that dataset
4. Search for the country using the ISO3 code
5. Look for a country-specific view if available
6. Retrieve boundaries at the specified admin level or all levels
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
iso3_code | str | The ISO3 country code (e.g., 'USA', 'KEN', 'BRA'). | required |
admin_level | int | The administrative level to retrieve: - 0: Country level - 1: State/Province/Region level - 2: District/County level - 3: Sub-district/Municipality level - etc. If None, retrieves all available administrative levels. | None |
Returns:
| Type | Description |
|---|---|
dict | A GeoJSON FeatureCollection containing the requested boundaries. Each feature includes geometry and properties for the administrative unit. |
Raises:
| Type | Description |
|---|---|
ValueError | If the Admin Boundaries module, datasets, views, or country cannot be found. |
HTTPError | If any API requests fail. |
Note
This function may make multiple API calls and can take some time for countries with many administrative units. It handles pagination automatically and attempts to use country-specific views when available for better performance.
Example

```python
# Get all administrative levels for Kenya
boundaries = get_country_boundaries_by_iso3('KEN')

# Get only province-level boundaries for Kenya
provinces = get_country_boundaries_by_iso3('KEN', admin_level=1)
```
Source code in gigaspatial/handlers/unicef_georepo.py
worldpop ¶
WPPopulationConfig dataclass ¶
Bases: BaseHandlerConfig
Source code in gigaspatial/handlers/worldpop.py
extract_search_geometry(source, **kwargs) ¶
Overrides the base method because geometry extraction does not apply to WorldPop datasets. Returns the country ISO3 code used for the dataset search.
Source code in gigaspatial/handlers/worldpop.py
get_data_unit_path(unit, **kwargs) ¶
get_data_unit_paths(units, **kwargs) ¶
Given WorldPop file URL(s), return the corresponding local file paths.
- For school_age age_structures (zip resources), if extracted .tif files are present in the target directory, return those; otherwise, return the zip path(s) to allow the downloader to fetch and extract them.
- For non-school_age age_structures (individual .tif URLs), you can filter by sex and age using kwargs: sex, ages, min_age, max_age.
Source code in gigaspatial/handlers/worldpop.py
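The sex and age filtering described above can be sketched as a plain function over file paths. This is only an illustration, not the library's implementation: the helper name `filter_age_sex_paths` is hypothetical, and the file-name layout `<iso3>_<sex>_<age>_<year>.tif` (for example `rwa_f_5_2020.tif`) is an assumption about how the age-structure rasters are named.

```python
import re
from typing import Iterable, List, Optional

# Assumed file-name layout: <iso3>_<sex>_<age>_<year>..., e.g. "rwa_f_5_2020.tif".
_NAME_RE = re.compile(r"^[a-z]{3}_(?P<sex>[fm])_(?P<age>\d+)_(?P<year>\d{4})")


def filter_age_sex_paths(
    paths: Iterable[str],
    sex: Optional[str] = None,
    ages: Optional[Iterable[int]] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
) -> List[str]:
    """Keep only the paths whose file name matches the requested sex and ages."""
    wanted_ages = set(ages) if ages is not None else None
    selected = []
    for path in paths:
        name = path.rsplit("/", 1)[-1].lower()
        match = _NAME_RE.match(name)
        if match is None:
            continue  # not an age/sex raster under the assumed naming scheme
        file_sex = match.group("sex")
        file_age = int(match.group("age"))
        if sex is not None and file_sex != sex.lower():
            continue
        if wanted_ages is not None and file_age not in wanted_ages:
            continue
        if min_age is not None and file_age < min_age:
            continue
        if max_age is not None and file_age > max_age:
            continue
        selected.append(path)
    return selected
```

All four filters are optional and combine with a logical AND, so calling the helper without any filter returns every path that matches the assumed naming scheme.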
validate_configuration() ¶
Validate and normalise configuration based on dataset availability constraints.
GR1 (pop):
- Constrained: only 2020 at 100m, with or without UN adjustment.
- Unconstrained: 100m or 1km, with or without UN adjustment, years 2000-2020.
GR1 (age_structures):
- School age: only 2020, 1km, unconstrained, no UN adjustment.
- Non-school age: 100m only.
- Unconstrained: no UN adjustment.
- Constrained + UN adjusted: only 2020.
- Constrained, no UN adjustment: only 2020.
GR2 (pop):
- Constrained datasets only.
- Years 2015-2030 at 100m or 1km.
- No UN adjustment.
GR2 (age_structures):
- Constrained datasets only, no UN adjustment, no school age.
- Years 2015-2030.
- Full age structures: 100m or 1km.
- Under-18 population (under_18=True): 100m only.
Source code in gigaspatial/handlers/worldpop.py
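To make the availability rules concrete, the sketch below encodes only the two `pop` rule sets (GR1 and GR2) as a standalone check. It is not the library's `validate_configuration`: the function name `check_pop_config` and its argument names are hypothetical, and the real method may normalise values rather than just report problems.

```python
from typing import List


def check_pop_config(
    release: str,        # "GR1" or "GR2" (hypothetical argument name)
    constrained: bool,
    resolution: int,     # metres: 100 or 1000
    year: int,
    un_adjusted: bool,
) -> List[str]:
    """Return a list of rule violations; an empty list means the combination is allowed."""
    problems = []
    if resolution not in (100, 1000):
        problems.append("resolution must be 100 or 1000 metres")

    if release == "GR1":
        if constrained:
            # GR1 constrained: only 2020 at 100m, UN adjustment optional.
            if year != 2020:
                problems.append("GR1 constrained data exists only for 2020")
            if resolution != 100:
                problems.append("GR1 constrained data exists only at 100m")
        elif not 2000 <= year <= 2020:
            # GR1 unconstrained: 100m or 1km, UN adjustment optional.
            problems.append("GR1 unconstrained data covers 2000-2020")
    elif release == "GR2":
        if not constrained:
            problems.append("GR2 provides constrained datasets only")
        if not 2015 <= year <= 2030:
            problems.append("GR2 data covers 2015-2030")
        if un_adjusted:
            problems.append("GR2 has no UN-adjusted datasets")
    else:
        problems.append(f"unknown release: {release!r}")
    return problems
```

For example, `check_pop_config("GR2", False, 100, 2010, True)` reports three problems: unconstrained data, a year outside 2015-2030, and UN adjustment.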
WPPopulationDownloader ¶
Bases: BaseHandlerDownloader
Source code in gigaspatial/handlers/worldpop.py
__init__(config, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[WPPopulationConfig, dict[str, Union[str, int]]] | Configuration for the WorldPop dataset, either as a WPPopulationConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/worldpop.py
download_data_unit(url, **kwargs) ¶
Download the data file for a URL. If the file is a zip archive, extract the .tif files it contains.
Source code in gigaspatial/handlers/worldpop.py
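The extraction step can be illustrated with the standard library alone. The sketch below assumes the archive has already been downloaded to a local path; the helper name `extract_tifs` is hypothetical and the actual downloader may work through its `DataStore` instead of the local file system.

```python
import zipfile
from pathlib import Path
from typing import List, Union


def extract_tifs(zip_path: Union[str, Path], target_dir: Union[str, Path]) -> List[Path]:
    """Extract only the .tif members of a zip archive and return their local paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".tif"):
                # ZipFile.extract returns the path of the file it created.
                extracted.append(Path(archive.extract(member, target)))
    return extracted
```

Members that are not .tif files are left inside the archive, so the target directory only receives raster files.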
download_data_units(urls, **kwargs) ¶
Download data files for multiple URLs.
Source code in gigaspatial/handlers/worldpop.py
WPPopulationHandler ¶
Bases: BaseHandler
Handler for WorldPop population datasets.
This class provides a unified interface for downloading and loading WP Population data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/worldpop.py
create_config(data_store, logger, **kwargs) ¶
Create and return a WPPopulationConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationConfig | Configured WPPopulationConfig instance |
Source code in gigaspatial/handlers/worldpop.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a WPPopulationDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | WPPopulationConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationDownloader | Configured WPPopulationDownloader instance |
Source code in gigaspatial/handlers/worldpop.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a WPPopulationReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | WPPopulationConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationReader | Configured WPPopulationReader instance |
Source code in gigaspatial/handlers/worldpop.py
load_into_dataframe(source, ensure_available=True, **kwargs) ¶
Load WorldPop data into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | str | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame containing the WorldPop data |
Source code in gigaspatial/handlers/worldpop.py
load_into_geodataframe(source, ensure_available=True, **kwargs) ¶
Load WorldPop data into a geopandas GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | str | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing the WorldPop data |
Source code in gigaspatial/handlers/worldpop.py
WPPopulationReader ¶
Bases: BaseHandlerReader
Source code in gigaspatial/handlers/worldpop.py
__init__(config, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[WPPopulationConfig, dict[str, Union[str, int]]] | Configuration for the WorldPop dataset, either as a WPPopulationConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/worldpop.py
load_from_paths(source_data_path, merge_rasters=False, **kwargs) ¶
Load TifProcessors of WP datasets.
Args:
- source_data_path: List of file paths to load.
- merge_rasters: If True, all rasters will be merged into a single TifProcessor. Defaults to False.
Returns:
- Union[List[TifProcessor], TifProcessor]: List of TifProcessor objects for accessing the raster data, or a single TifProcessor if merge_rasters is True.
Source code in gigaspatial/handlers/worldpop.py
WorldPopRestClient ¶
REST API client for WorldPop data access.
This class provides direct access to the WorldPop REST API without any configuration dependencies, allowing flexible integration patterns.
Source code in gigaspatial/handlers/worldpop.py
__enter__() ¶
__exit__(exc_type, exc_val, exc_tb) ¶
__init__(base_url='https://www.worldpop.org/rest/data', stats_url='https://api.worldpop.org/v1/services/stats', api_key=None, timeout=600, logger=None) ¶
Initialize the WorldPop REST API client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
base_url | str | Base URL for the WorldPop REST API | 'https://www.worldpop.org/rest/data' |
stats_url | str | URL for the WorldPop statistics API | 'https://api.worldpop.org/v1/services/stats' |
api_key | Optional[str] | Optional API key for higher rate limits | None |
timeout | int | Request timeout in seconds | 600 |
logger | Optional[Logger] | Optional logger instance | None |
Source code in gigaspatial/handlers/worldpop.py
close() ¶
find_dataset(dataset_type, category, iso3, year, **filters) ¶
Find a specific dataset by year and optional filters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias | required |
category | str | Category alias | required |
iso3 | str | ISO3 country code | required |
year | Union[str, int] | Year to search for | required |
**filters | Additional filters (e.g., gender='F', resolution='1km') | {} |
Returns:
| Type | Description |
|---|---|
Optional[Dict[str, Any]] | Dataset dictionary or None if not found |
Source code in gigaspatial/handlers/worldpop.py
get_available_projects() ¶
Get list of all available projects (e.g., population, births, pregnancies, etc.).
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of project dictionaries with alias, name, title, and description |
Source code in gigaspatial/handlers/worldpop.py
get_dataset_by_id(dataset_type, category, dataset_id) ¶
Get dataset information by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
dataset_id | str | Dataset ID | required |
Returns:
| Type | Description |
|---|---|
Optional[Dict[str, Any]] | Dataset dictionary or None if not found |
Source code in gigaspatial/handlers/worldpop.py
get_dataset_info(dataset) ¶
Extract useful information from a dataset dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset | Dict[str, Any] | Dataset dictionary from API | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Cleaned dataset information |
Source code in gigaspatial/handlers/worldpop.py
get_datasets(dataset_type, category, params) ¶
Get all datasets available for the params.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
params | dict | Query parameters (e.g., {'iso3': 'RWA'}) | required |
Returns:
| Type | Description |
|---|---|
| List of dataset dictionaries with metadata and file information |
Source code in gigaspatial/handlers/worldpop.py
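As a rough illustration of how `dataset_type`, `category`, and `params` might combine into a request, the sketch below builds a query URL from the default `base_url` shown in the constructor. The path layout `<base_url>/<dataset_type>/<category>?<params>` and the helper name `build_datasets_url` are assumptions, not something this reference confirms about the client's internals.

```python
from typing import Mapping
from urllib.parse import urlencode

# Default taken from the WorldPopRestClient constructor signature above.
BASE_URL = "https://www.worldpop.org/rest/data"


def build_datasets_url(
    dataset_type: str,
    category: str,
    params: Mapping[str, str],
    base_url: str = BASE_URL,
) -> str:
    """Compose a dataset listing URL (assumed layout: base/type/category?query)."""
    url = f"{base_url.rstrip('/')}/{dataset_type}/{category}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(build_datasets_url("pop", "wpgp", {"iso3": "RWA"}))
```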
get_datasets_by_country(dataset_type, category, iso3) ¶
Get all datasets available for a specific country.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
iso3 | str | ISO3 country code (e.g., 'USA', 'BRA') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of dataset dictionaries with metadata and file information |
Source code in gigaspatial/handlers/worldpop.py
get_project_sources(dataset_type) ¶
Get available sources for a specific project type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Project type alias (e.g., 'pop', 'births', 'pregnancies') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of source dictionaries with alias and name |
Source code in gigaspatial/handlers/worldpop.py
get_source_entities(dataset_type, category) ¶
Get list of entities (countries, global, continental) available for a specific project type and source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Project type alias (e.g., 'pop', 'births') | required |
category | str | Source alias (e.g., 'wpgp', 'pic') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of entity dictionaries with id and iso3 codes (if applicable) |
Source code in gigaspatial/handlers/worldpop.py
list_years_for_country(dataset_type, category, iso3) ¶
List all available years for a specific country and dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias | required |
category | str | Category alias | required |
iso3 | str | ISO3 country code | required |
Returns:
| Type | Description |
|---|---|
List[int] | Sorted list of available years |
Source code in gigaspatial/handlers/worldpop.py
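Deriving a sorted year list from dataset dictionaries could look like the sketch below. The `popyear` key is an assumption about the field that carries the year in each dataset dictionary, and `years_from_datasets` is a hypothetical helper rather than part of the client.

```python
from typing import Any, Dict, Iterable, List


def years_from_datasets(
    datasets: Iterable[Dict[str, Any]], year_key: str = "popyear"
) -> List[int]:
    """Collect the distinct years found under `year_key`, sorted ascending."""
    years = set()
    for dataset in datasets:
        value = dataset.get(year_key)
        try:
            years.add(int(value))
        except (TypeError, ValueError):
            continue  # skip entries with a missing or non-numeric year
    return sorted(years)
```

Entries without a usable year are skipped instead of raising, which keeps the result usable when a listing mixes yearly and non-yearly datasets.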
search_datasets(dataset_type=None, category=None, iso3=None, year=None, **filters) ¶
Search for datasets with flexible filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | Optional[str] | Optional dataset type filter | None |
category | Optional[str] | Optional category filter | None |
iso3 | Optional[str] | Optional country filter | None |
year | Optional[Union[str, int]] | Optional year filter | None |
**filters | Additional filters | {} |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of matching datasets |
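The flexible filtering can be pictured as a predicate applied to each dataset dictionary, where every supplied filter must match. The sketch below is a standalone illustration: the helper name `filter_datasets` and the `iso3`, `popyear`, and `resolution` keys are assumptions about the dictionary layout, not the documented schema.

```python
from typing import Any, Dict, Iterable, List, Optional, Union


def filter_datasets(
    datasets: Iterable[Dict[str, Any]],
    iso3: Optional[str] = None,
    year: Optional[Union[str, int]] = None,
    **filters: Any,
) -> List[Dict[str, Any]]:
    """Return the datasets matching every filter that was supplied."""

    def matches(dataset: Dict[str, Any]) -> bool:
        if iso3 is not None and str(dataset.get("iso3", "")).upper() != iso3.upper():
            return False
        # Compare years as strings so 2020 and "2020" are treated alike.
        if year is not None and str(dataset.get("popyear")) != str(year):
            return False
        return all(str(dataset.get(key)) == str(value) for key, value in filters.items())

    return [dataset for dataset in datasets if matches(dataset)]
```

A `find_dataset`-style lookup could then take the first element of the result, or `None` when the list is empty.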