Handlers Module
gigaspatial.handlers
¶
base
¶
BaseHandler
¶
Bases: ABC
Abstract base class that orchestrates configuration, downloading, and reading functionality.
This class serves as the main entry point for dataset handlers, providing a unified interface for data acquisition and loading. It manages the lifecycle of config, downloader, and reader components.
Subclasses should implement the abstract methods to provide specific handler types and define how components are created and interact.
Source code in gigaspatial/handlers/base.py
config: BaseHandlerConfig
property
¶
Get the configuration object.
downloader: BaseHandlerDownloader
property
¶
Get the downloader object.
reader: BaseHandlerReader
property
¶
Get the reader object.
__enter__()
¶
__exit__(exc_type, exc_val, exc_tb)
¶
__init__(config=None, downloader=None, reader=None, data_store=None, logger=None)
¶
Initialize the BaseHandler with optional components.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Optional[BaseHandlerConfig] | Configuration object. If None, will be created via create_config() | None |
downloader | Optional[BaseHandlerDownloader] | Downloader instance. If None, will be created via create_downloader() | None |
reader | Optional[BaseHandlerReader] | Reader instance. If None, will be created via create_reader() | None |
data_store | Optional[DataStore] | Data store instance. Defaults to LocalDataStore if not provided | None |
logger | Optional[Logger] | Logger instance. If not provided, creates one based on class name | None |
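The "use it if given, otherwise build it" behaviour in the table above can be sketched with a small standard-library example. This is only an illustration of the pattern: `SketchHandler`, `DictHandler`, and the simplified factory signatures are hypothetical and are not the gigaspatial classes.

```python
from abc import ABC, abstractmethod
import logging


class SketchHandler(ABC):
    """Toy stand-in for the pattern described above (not the gigaspatial class)."""

    def __init__(self, config=None, downloader=None, reader=None, logger=None):
        # Fall back to a logger named after the concrete class.
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        # Each component is used as given, or built by the matching factory.
        self.config = config if config is not None else self.create_config()
        self.downloader = (
            downloader if downloader is not None else self.create_downloader(self.config)
        )
        self.reader = reader if reader is not None else self.create_reader(self.config)

    @abstractmethod
    def create_config(self): ...

    @abstractmethod
    def create_downloader(self, config): ...

    @abstractmethod
    def create_reader(self, config): ...


class DictHandler(SketchHandler):
    """Concrete subclass whose components are plain dicts, for illustration."""

    def create_config(self):
        return {"base_path": "data/"}

    def create_downloader(self, config):
        return {"kind": "downloader", "config": config}

    def create_reader(self, config):
        return {"kind": "reader", "config": config}


# Only the reader is supplied; config and downloader come from the factories.
handler = DictHandler(reader={"kind": "custom-reader"})
```

Passing a component explicitly bypasses its factory, which is convenient when a test needs a stub reader or downloader.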
Source code in gigaspatial/handlers/base.py
__repr__()
¶
String representation of the handler.
Source code in gigaspatial/handlers/base.py
cleanup()
¶
Cleanup resources used by the handler.
Override in subclasses if specific cleanup is needed.
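Because the class defines `__enter__`, `__exit__`, and `cleanup()`, it can presumably be used in a `with` statement. The sketch below shows one plausible wiring, assuming `__exit__` calls `cleanup()`; the documentation above does not confirm this, and `ManagedResource` is a hypothetical stand-in.

```python
class ManagedResource:
    """Hypothetical handler-like object; not part of gigaspatial."""

    def __init__(self):
        self.cleaned_up = False

    def cleanup(self):
        # Subclasses would release temp files, connections, etc. here.
        self.cleaned_up = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Assumption: cleanup runs on exit whether or not an exception occurred.
        self.cleanup()
        return False  # do not suppress exceptions


with ManagedResource() as resource:
    inside = resource.cleaned_up
```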
create_config(data_store, logger, **kwargs)
abstractmethod
¶
Create and return a configuration object for this handler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
Type | Description |
---|---|
BaseHandlerConfig | Configured BaseHandlerConfig instance |
Source code in gigaspatial/handlers/base.py
create_downloader(config, data_store, logger, **kwargs)
abstractmethod
¶
Create and return a downloader object for this handler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | BaseHandlerConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
Type | Description |
---|---|
BaseHandlerDownloader | Configured BaseHandlerDownloader instance |
Source code in gigaspatial/handlers/base.py
create_reader(config, data_store, logger, **kwargs)
abstractmethod
¶
Create and return a reader object for this handler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | BaseHandlerConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
Type | Description |
---|---|
BaseHandlerReader | Configured BaseHandlerReader instance |
Source code in gigaspatial/handlers/base.py
download_and_load(source, force_download=False, **kwargs)
¶
Convenience method to download (if needed) and load data in one call.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
force_download | bool | If True, download even if data exists locally | False |
**kwargs | | Additional parameters | {} |
Returns:
Type | Description |
---|---|
Any | Loaded data |
Source code in gigaspatial/handlers/base.py
ensure_data_available(source, force_download=False, **kwargs)
¶
Ensure that data is available for the given source.
This method checks if the required data exists locally, and if not (or if force_download is True), downloads it using the downloader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
force_download | bool | If True, download even if data exists locally | False |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if data is available after this operation |
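The check-then-download behaviour can be sketched as follows. This is a simplified illustration using local paths and a caller-supplied `download` callable; `ensure_available` and `fake_download` are hypothetical names, and the real method resolves paths from `source` through the config and downloader.

```python
from pathlib import Path
import tempfile


def ensure_available(paths, download, force_download=False):
    """Download whatever is missing (or everything when forced); report availability."""
    targets = [Path(p) for p in paths]
    to_fetch = targets if force_download else [p for p in targets if not p.exists()]
    for path in to_fetch:
        download(path)
    return all(p.exists() for p in targets)


def fake_download(path):
    # Stand-in for a real downloader: just writes a placeholder file.
    path.write_text("placeholder")
    fetched.append(path.name)


fetched = []
with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "tile_a.tif"
    existing.write_text("already here")
    missing = Path(tmp) / "tile_b.tif"

    # First pass: only the missing file is fetched.
    ok = ensure_available([existing, missing], fake_download)
    first_pass = list(fetched)
    # Second pass: force_download re-fetches everything.
    ensure_available([existing, missing], fake_download, force_download=True)
    forced_pass = fetched[len(first_pass):]
```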
Source code in gigaspatial/handlers/base.py
get_available_data_info(source, **kwargs)
¶
Get information about available data for the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame] | The data source specification | required |
**kwargs | | Additional parameters | {} |
Returns:
Name | Type | Description |
---|---|---|
dict | dict | Information about data availability, paths, etc. |
Source code in gigaspatial/handlers/base.py
load_data(source, ensure_available=True, **kwargs)
¶
Load data from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
Type | Description |
---|---|
Any | Loaded data (type depends on specific handler implementation) |
Source code in gigaspatial/handlers/base.py
BaseHandlerConfig
dataclass
¶
Bases: ABC
Abstract base class for handler configuration objects. Provides standard fields for path, parallelism, data store, and logger. Extend this class for dataset-specific configuration.
Source code in gigaspatial/handlers/base.py
get_data_unit_path(unit, **kwargs)
abstractmethod
¶
get_data_unit_paths(units, **kwargs)
¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/base.py
get_relevant_data_units_by_country(country, **kwargs)
¶
Given a country code or name, return a list of relevant data unit identifiers.
Source code in gigaspatial/handlers/base.py
get_relevant_data_units_by_geometry(geometry, **kwargs)
abstractmethod
¶
Given a geometry, return a list of relevant data unit identifiers (e.g., tiles, files, resources).
Source code in gigaspatial/handlers/base.py
get_relevant_data_units_by_points(points, **kwargs)
abstractmethod
¶
Given a list of points, return a list of relevant data unit identifiers.
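To make the idea of "data units" concrete, here is a minimal sketch of a config for a dataset split into fixed-size degree tiles. Everything in it is an assumption for illustration: `GridTileConfig`, the `R{row}_C{col}` naming, the 10-degree tile size, and the `.tif` extension are not taken from gigaspatial.

```python
from dataclasses import dataclass
from pathlib import Path
import math


@dataclass
class GridTileConfig:
    """Hypothetical config for a dataset cut into fixed-size degree tiles."""

    base_path: Path = Path("data/tiles")
    tile_size_deg: float = 10.0

    def get_relevant_data_units_by_points(self, points):
        # points are (latitude, longitude) pairs; each maps to one tile ID.
        tiles = set()
        for lat, lon in points:
            row = math.floor(lat / self.tile_size_deg)
            col = math.floor(lon / self.tile_size_deg)
            tiles.add(f"R{row}_C{col}")
        return sorted(tiles)

    def get_data_unit_path(self, unit):
        return self.base_path / f"{unit}.tif"

    def get_data_unit_paths(self, units):
        # The plural form simply maps the singular one over every unit.
        return [self.get_data_unit_path(u) for u in units]


config = GridTileConfig()
units = config.get_relevant_data_units_by_points([(1.5, 36.8), (2.0, 37.1), (-4.0, 39.6)])
paths = config.get_data_unit_paths(units)
```

Two of the three points fall in the same tile, so only two unit identifiers (and two paths) are returned.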
BaseHandlerDownloader
¶
Bases: ABC
Abstract base class for handler downloader classes. Standardizes config, data_store, and logger initialization. Extend this class for dataset-specific downloaders.
Source code in gigaspatial/handlers/base.py
BaseHandlerReader
¶
Bases: ABC
Abstract base class for handler reader classes. Provides common methods for resolving source paths and loading data. Supports resolving by country, points, geometry, GeoDataFrame, or explicit paths. Includes generic loader functions for raster and tabular data.
Source code in gigaspatial/handlers/base.py
load(source, **kwargs)
¶
Load data from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | The data source (country code/name, points, geometry, paths, etc.). | required |
**kwargs | | Additional parameters to pass to the loading process. | {} |
Returns:
Type | Description |
---|---|
Any | The loaded data. The type depends on the subclass implementation. |
Source code in gigaspatial/handlers/base.py
load_from_paths(source_data_path, **kwargs)
abstractmethod
¶
Abstract method to load source data from paths.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_data_path | List[Union[str, Path]] | List of source paths | required |
**kwargs | | Additional parameters for data loading | {} |
Returns:
Type | Description |
---|---|
Any | Loaded data (DataFrame, GeoDataFrame, etc.) |
Source code in gigaspatial/handlers/base.py
resolve_by_country(country, **kwargs)
¶
Resolve source paths for a given country code/name. Uses the config's get_relevant_data_units_by_country method.
Source code in gigaspatial/handlers/base.py
resolve_by_geometry(geometry, **kwargs)
¶
Resolve source paths for a geometry or GeoDataFrame. Uses the config's get_relevant_data_units_by_geometry method.
Source code in gigaspatial/handlers/base.py
resolve_by_paths(paths, **kwargs)
¶
Return explicit paths as a list.
Source code in gigaspatial/handlers/base.py
resolve_by_points(points, **kwargs)
¶
Resolve source paths for a list of points. Uses the config's get_relevant_data_units_by_points method.
Source code in gigaspatial/handlers/base.py
resolve_source_paths(source, **kwargs)
¶
Resolve source data paths based on the type of source input.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | Can be a country code or name (str), list of points, geometry, GeoDataFrame, or explicit path(s) | required |
**kwargs | | Additional parameters for path resolution | {} |
Returns:
Type | Description |
---|---|
List[Union[str, Path]] | List of resolved source paths |
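Resolution by source type amounts to a dispatch on the input's Python type. The sketch below shows that idea with standard-library types only; it leaves out the geometry and GeoDataFrame cases, `resolve_source_kind` is a hypothetical helper, and the rule that a bare string means a country is an assumption rather than documented behaviour.

```python
from pathlib import Path


def resolve_source_kind(source):
    """Classify a source the way a type-based resolver might (simplified sketch)."""
    if isinstance(source, Path):
        return "paths"
    if isinstance(source, str):
        # Assumption: a bare string is treated as a country code or name.
        return "country"
    if isinstance(source, (list, tuple)) and source:
        if all(isinstance(item, (str, Path)) for item in source):
            return "paths"
        if all(isinstance(item, tuple) and len(item) == 2 for item in source):
            return "points"
    raise ValueError(f"Unsupported source type: {type(source).__name__}")


kinds = [
    resolve_source_kind("KEN"),
    resolve_source_kind(Path("tile.tif")),
    resolve_source_kind(["a.tif", Path("b.tif")]),
    resolve_source_kind([(1.5, 36.8), (2.0, 37.1)]),
]
```

Each kind would then be routed to the matching `resolve_by_*` method.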
Source code in gigaspatial/handlers/base.py
boundaries
¶
AdminBoundaries
¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
create(country_code=None, admin_level=0, data_store=None, path=None, **kwargs)
classmethod
¶
Factory method to create an AdminBoundaries instance using various data sources, depending on the provided parameters and global configuration.
Loading Logic:
- If a data_store is provided and either a path is given or global_config.ADMIN_BOUNDARIES_DATA_DIR is set:
  - If path is not provided but country_code is, the path is constructed using global_config.get_admin_path().
  - Loads boundaries from the specified data store and path.
- If only country_code is provided (no data_store):
  - Attempts to load boundaries from GeoRepo (if available).
  - If GeoRepo is unavailable, attempts to load from GADM.
  - If GADM fails, falls back to geoBoundaries.
  - Raises an error if all sources fail.
- If neither country_code nor data_store is provided:
  - Raises a ValueError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country_code | Optional[str] | ISO country code (2 or 3 letter) or country name. | None |
admin_level | int | Administrative level (0=country, 1=state/province, etc.). | 0 |
data_store | Optional[DataStore] | Optional data store instance for loading from existing data. | None |
path | Optional[Union[str, Path]] | Optional path to data file (used with data_store). | None |
**kwargs | | Additional arguments passed to the underlying creation methods. | {} |
Returns:
Name | Type | Description |
---|---|---|
AdminBoundaries | AdminBoundaries | Configured instance. |
Raises:
Type | Description |
---|---|
ValueError | If neither country_code nor (data_store, path) are provided, or if country_code lookup fails. |
RuntimeError | If all data sources fail to load boundaries. |
Examples:
# Load from a data store (path auto-generated if not provided)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1, data_store=store)
# Load from a specific file in a data store
boundaries = AdminBoundaries.create(data_store=store, path="data.shp")
# Load from online sources (GeoRepo, GADM, geoBoundaries)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1)
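The online-source fallback (GeoRepo, then GADM, then geoBoundaries, then an error) is an ordered try-each-source loop. The sketch below shows that control flow with stand-in loader functions. It is not the library's implementation: `load_with_fallback` and the three fake loaders are hypothetical, and the simulated errors are invented for the demonstration.

```python
def load_with_fallback(loaders):
    """Try each (name, loader) pair in order; return the first successful result."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # a real implementation might catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All sources failed: " + "; ".join(errors))


def georepo_unavailable():
    raise ConnectionError("no API key configured")


def gadm_fails():
    raise TimeoutError("request timed out")


def geoboundaries_ok():
    return ["boundary-record"]


source, result = load_with_fallback(
    [
        ("GeoRepo", georepo_unavailable),
        ("GADM", gadm_fails),
        ("geoBoundaries", geoboundaries_ok),
    ]
)
```

Collecting the per-source error messages makes the final RuntimeError more useful when every source fails.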
Source code in gigaspatial/handlers/boundaries.py
from_data_store(data_store, path, admin_level=0, **kwargs)
classmethod
¶
Load and create instance from internal data store.
Source code in gigaspatial/handlers/boundaries.py
from_gadm(country_code, admin_level=0, **kwargs)
classmethod
¶
Load and create instance from GADM data.
Source code in gigaspatial/handlers/boundaries.py
from_georepo(country_code=None, admin_level=0, **kwargs)
classmethod
¶
Load and create instance from GeoRepo (UNICEF) API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country | | Country name (if using name-based lookup) | required |
iso3 | | ISO3 code (if using code-based lookup) | required |
admin_level | int | Administrative level (0=country, 1=state, etc.) | 0 |
api_key | | GeoRepo API key (optional) | required |
email | | GeoRepo user email (optional) | required |
kwargs | | Extra arguments (ignored) | {} |
Returns:
Type | Description |
---|---|
AdminBoundaries | AdminBoundaries instance |
Source code in gigaspatial/handlers/boundaries.py
get_schema_config()
classmethod
¶
to_geodataframe()
¶
Convert the AdminBoundaries to a GeoDataFrame.
Source code in gigaspatial/handlers/boundaries.py
AdminBoundary
¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
ghsl
¶
CoordSystem
¶
GHSLDataConfig
dataclass
¶
Bases: BaseHandlerConfig
Source code in gigaspatial/handlers/ghsl.py
__repr__()
¶
Return a string representation of the GHSL dataset configuration.
Source code in gigaspatial/handlers/ghsl.py
compute_dataset_url(tile_id=None)
¶
Compute the download URL for a GHSL dataset.
Source code in gigaspatial/handlers/ghsl.py
get_data_unit_path(unit=None, file_ext='.zip', **kwargs)
¶
Construct and return the path for the configured dataset or dataset tile.
Source code in gigaspatial/handlers/ghsl.py
get_relevant_data_units_by_geometry(geometry, **kwargs)
¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/ghsl.py
get_relevant_data_units_by_points(points, **kwargs)
¶
Return intersecting tiles for a list of points.
validate_configuration()
¶
Validate that the configuration is valid based on dataset availability constraints.
Specific rules:
Source code in gigaspatial/handlers/ghsl.py
GHSLDataDownloader
¶
Bases: BaseHandlerDownloader
A class to handle downloads of GHSL datasets.
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None)
¶
Initialize the downloader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
download(source, extract=True, file_pattern='.*\\.tif$', **kwargs)
¶
Download GHSL data for a specified geographic region.
The region can be defined by a country code/name, a list of points, a Shapely geometry, or a GeoDataFrame. This method identifies the GHSL tiles that intersect the region and downloads the data for those tiles in parallel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame] | Defines the geographic area for which to download data. Can be a string representing a country code or name; a list of (latitude, longitude) tuples or Shapely Point objects; a Shapely BaseGeometry object (e.g., Polygon, MultiPolygon); or a GeoDataFrame with its geometry column in EPSG:4326. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments. These will be passed down to | {} |
Returns:
Type | Description |
---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the region or if all downloads fail. |
Source code in gigaspatial/handlers/ghsl.py
download_by_country(country_code, data_store=None, country_geom_path=None, extract=True, file_pattern='.*\\.tif$', **kwargs)
¶
Download GHSL data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country_code | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments that are passed to | {} |
Returns:
Type | Description |
---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/ghsl.py
download_data_unit(tile_id, extract=True, file_pattern='.*\\.tif$', **kwargs)
¶
Downloads and optionally extracts files for a given tile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tile_id | str | ID of the tile to process. | required |
extract | bool | If True and the downloaded file is a zip, extract its contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
Type | Description |
---|---|
Optional[Union[Path, List[Path]]] | Path to the downloaded file if extract=False, list of paths to the extracted files if extract=True, None on failure. |
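The effect of `file_pattern` on extraction can be illustrated with the standard `zipfile` and `re` modules. This is a sketch of the filtering idea only; `extract_matching` is a hypothetical helper, and using `re.match` against member names is an assumption about how the pattern is applied.

```python
import re
import tempfile
import zipfile
from pathlib import Path


def extract_matching(zip_path, out_dir, file_pattern=r".*\.tif$"):
    """Extract only the archive members whose names match file_pattern."""
    pattern = re.compile(file_pattern) if file_pattern else None
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if pattern is None or pattern.match(name):
                extracted.append(Path(archive.extract(name, out_dir)))
    return extracted


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive with one raster and two files that should be skipped.
    zip_path = Path(tmp) / "tile.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("GHS_tile.tif", b"raster bytes")
        archive.writestr("GHS_tile.tif.ovr", b"overview")
        archive.writestr("README.txt", b"notes")

    names = [p.name for p in extract_matching(zip_path, Path(tmp) / "out")]
```

The trailing `$` in the default pattern is what keeps sidecar files such as `.tif.ovr` out of the result.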
Source code in gigaspatial/handlers/ghsl.py
download_data_units(tile_ids, extract=True, file_pattern='.*\\.tif$', **kwargs)
¶
Downloads multiple tiles in parallel, with an option to extract them.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tile_ids | List[str] | A list of tile IDs to download. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
Type | Description |
---|---|
List[Optional[Union[Path, List[Path]]]] | A list where each element corresponds to a tile ID and contains the result of download_data_unit for that tile: the path to the downloaded file if extract=False, a list of paths to the extracted files if extract=True, or None on failure. |
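Downloading many tiles in parallel while keeping one result slot per tile can be sketched with `concurrent.futures`. The thread pool is a swapped-in technique chosen for illustration, since the documentation does not say which parallelism mechanism the library uses. `download_many`, `fake_download`, and the tile IDs are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def download_many(tile_ids, download_one, max_workers=4):
    """Run download_one for every tile in parallel, keeping input order."""

    def safe(tile_id):
        try:
            return download_one(tile_id)
        except Exception:
            return None  # a failed tile becomes None instead of aborting the batch

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of tile_ids.
        return list(pool.map(safe, tile_ids))


def fake_download(tile_id):
    # Stand-in for a real per-tile download.
    if tile_id == "R5_C20":
        raise IOError("simulated network failure")
    return f"/tmp/{tile_id}.zip"


results = download_many(["R4_C19", "R5_C20", "R6_C21"], fake_download)
```

Because results stay aligned with the input list, a caller can zip `tile_ids` with `results` to see which tiles failed.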
Source code in gigaspatial/handlers/ghsl.py
GHSLDataHandler
¶
Bases: BaseHandler
Handler for GHSL (Global Human Settlement Layer) dataset.
This class provides a unified interface for downloading and loading GHSL data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/ghsl.py
__init__(product, year=2020, resolution=100, config=None, downloader=None, reader=None, data_store=None, logger=None, **kwargs)
¶
Initialize the GHSLDataHandler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
product | Literal['GHS_BUILT_S', 'GHS_BUILT_H_AGBH', 'GHS_BUILT_H_ANBH', 'GHS_BUILT_V', 'GHS_POP', 'GHS_SMOD'] | The GHSL product to use. Must be one of: GHS_BUILT_S (built-up surface), GHS_BUILT_H_AGBH (average gross building height), GHS_BUILT_H_ANBH (average net building height), GHS_BUILT_V (building volume), GHS_POP (population), GHS_SMOD (settlement model). | required |
year | int | The year of the data (default: 2020) | 2020 |
resolution | int | The resolution in meters (default: 100) | 100 |
config | Optional[GHSLDataConfig] | Optional configuration object | None |
downloader | Optional[GHSLDataDownloader] | Optional downloader instance | None |
reader | Optional[GHSLDataReader] | Optional reader instance | None |
data_store | Optional[DataStore] | Optional data store instance | None |
logger | Optional[Logger] | Optional logger instance | None |
**kwargs | | Additional configuration parameters | {} |
Source code in gigaspatial/handlers/ghsl.py
create_config(data_store, logger, **kwargs)
¶
Create and return a GHSLDataConfig instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
Type | Description |
---|---|
GHSLDataConfig | Configured GHSLDataConfig instance |
Source code in gigaspatial/handlers/ghsl.py
create_downloader(config, data_store, logger, **kwargs)
¶
Create and return a GHSLDataDownloader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
Type | Description |
---|---|
GHSLDataDownloader | Configured GHSLDataDownloader instance |
Source code in gigaspatial/handlers/ghsl.py
create_reader(config, data_store, logger, **kwargs)
¶
Create and return a GHSLDataReader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
Type | Description |
---|---|
GHSLDataReader | Configured GHSLDataReader instance |
Source code in gigaspatial/handlers/ghsl.py
load_into_dataframe(source, ensure_available=True, **kwargs)
¶
Load GHSL data into a pandas DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
load_into_geodataframe(source, ensure_available=True, **kwargs)
¶
Load GHSL data into a geopandas GeoDataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
Type | Description |
---|---|
DataFrame | GeoDataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
GHSLDataReader
¶
Bases: BaseHandlerReader
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None)
¶
Initialize the reader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
load_from_paths(source_data_path, **kwargs)
¶
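Load TifProcessors from the GHSL dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_data_path | | List of file paths to load | required |
Returns:
Type | Description |
---|---|
List[TifProcessor] | List of TifProcessor objects for accessing the raster data |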
Source code in gigaspatial/handlers/ghsl.py
giga
¶
GigaSchoolLocationFetcher
¶
Fetch and process school location data from the Giga School Geolocation Data API.
Source code in gigaspatial/handlers/giga.py
fetch_locations(**kwargs)
¶
Fetch and process school locations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | | Additional parameters for customization: page_size (override the default page size); sleep_time (override the default sleep time between requests); max_pages (limit the number of pages to fetch) | {} |
Returns:
Type | Description |
---|---|
DataFrame | School locations with geospatial info. |
Source code in gigaspatial/handlers/giga.py
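The `page_size`, `sleep_time`, and `max_pages` options describe a paginated fetch loop. The following is a minimal sketch of that pattern, not the fetcher's actual implementation. It assumes a hypothetical `fetch_page(page, page_size)` callable in place of the real API request, and the default values shown are illustrative only.

```python
import time
from typing import Callable, Dict, List, Optional


def fetch_all_pages(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 1000,  # illustrative default, not the library's
    sleep_time: float = 0.2,  # illustrative default, not the library's
    max_pages: Optional[int] = None,
) -> List[Dict]:
    """Collect records page by page until a short page or max_pages is hit."""
    records: List[Dict] = []
    page = 1
    while max_pages is None or page <= max_pages:
        batch = fetch_page(page, page_size)  # hypothetical stand-in for the API call
        records.extend(batch)
        if len(batch) < page_size:
            break  # a short (or empty) page means there is nothing left
        page += 1
        time.sleep(sleep_time)  # pause between requests
    return records


# Fake data source: 25 schools served in pages.
_SCHOOLS = [{"school_id": i} for i in range(25)]


def fake_fetch_page(page: int, page_size: int) -> List[Dict]:
    start = (page - 1) * page_size
    return _SCHOOLS[start : start + page_size]


all_schools = fetch_all_pages(fake_fetch_page, page_size=10, sleep_time=0)
first_two_pages = fetch_all_pages(
    fake_fetch_page, page_size=10, sleep_time=0, max_pages=2
)
```

Capping `max_pages` is handy for smoke tests, since it bounds both the runtime and the number of API calls.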
GigaSchoolMeasurementsFetcher
¶
Fetch and process daily real-time connectivity measurements for schools from the Giga API, including download/upload speeds, latency, and connectivity performance data.
Source code in gigaspatial/handlers/giga.py
fetch_measurements(**kwargs)
¶
Fetch and process school connectivity measurements.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | | Additional parameters for customization: page_size (override the default page size); sleep_time (override the default sleep time between requests); max_pages (limit the number of pages to fetch); giga_id_school (override the default giga_id_school filter); start_date (override the default start_date); end_date (override the default end_date) | {} |
Returns:
Type | Description |
---|---|
DataFrame | School measurements with connectivity performance data. |
Source code in gigaspatial/handlers/giga.py
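The keyword arguments above override defaults on a per-call basis. One way to picture that override behaviour is the sketch below. The `DEFAULTS` values and the `build_query` helper are hypothetical, and the real request parameters may be named or encoded differently.

```python
from datetime import date
from typing import Any, Dict

# Hypothetical instance-level defaults; the real fetcher defines its own.
DEFAULTS: Dict[str, Any] = {
    "page_size": 100,
    "giga_id_school": None,
    "start_date": date(2024, 1, 1),
    "end_date": date(2024, 1, 31),
}


def build_query(**kwargs: Any) -> Dict[str, Any]:
    """Merge call-time overrides into the defaults and drop unset filters."""
    unknown = set(kwargs) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"Unexpected option(s): {sorted(unknown)}")
    merged = {**DEFAULTS, **kwargs}
    query: Dict[str, Any] = {}
    for key, value in merged.items():
        if value is None:
            continue  # no filter requested for this key
        # Dates are serialized as ISO strings (YYYY-MM-DD) in this sketch.
        query[key] = value.isoformat() if isinstance(value, date) else value
    return query


query = build_query(giga_id_school="abc-123", end_date=date(2024, 2, 15))
```

Rejecting unknown keys early is a design choice of this sketch; it surfaces typos such as `start_data` instead of silently ignoring them.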
get_performance_summary(df)
¶
Generate a comprehensive summary of connectivity performance metrics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
Returns:
Name | Type | Description |
---|---|---|
dict | dict | Summary statistics about connectivity performance |
Source code in gigaspatial/handlers/giga.py
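To give a feel for what such a summary dictionary might contain, here is a small standard-library sketch. The record field names (`download_mbps`, `upload_mbps`, `latency_ms`) and the shape of the result are assumptions made for illustration; the real method works on a DataFrame and may report different statistics.

```python
from statistics import mean, median
from typing import Dict, List

# Hypothetical measurement records; the field names are assumptions.
measurements: List[Dict] = [
    {"school": "A", "download_mbps": 10.0, "upload_mbps": 2.0, "latency_ms": 40.0},
    {"school": "A", "download_mbps": 14.0, "upload_mbps": 4.0, "latency_ms": 60.0},
    {"school": "B", "download_mbps": 3.0, "upload_mbps": 1.0, "latency_ms": 110.0},
]


def summarize(records: List[Dict]) -> Dict:
    """Return simple per-metric statistics over all measurement records."""
    summary: Dict = {
        "total_measurements": len(records),
        "unique_schools": len({r["school"] for r in records}),
    }
    for metric in ("download_mbps", "upload_mbps", "latency_ms"):
        values = [r[metric] for r in records if r.get(metric) is not None]
        if values:  # skip metrics that have no data at all
            summary[metric] = {
                "mean": mean(values),
                "median": median(values),
                "min": min(values),
                "max": max(values),
            }
    return summary


summary = summarize(measurements)
```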
get_school_performance_comparison(df, top_n=10)
¶
Compare performance across schools.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
top_n | int | Number of top/bottom schools to include | 10 |
Returns:
Name | Type | Description |
---|---|---|
dict | dict | School performance comparison |
Source code in gigaspatial/handlers/giga.py
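The role of `top_n` can be illustrated with a short ranking sketch. It assumes schools are ranked by mean download speed and uses hypothetical field names; the actual comparison criteria used by the method are not documented here.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def compare_schools(
    records: List[Dict], top_n: int = 10
) -> Dict[str, List[Tuple[str, float]]]:
    """Rank schools by mean download speed and return the top and bottom N."""
    speeds = defaultdict(list)
    for r in records:
        speeds[r["school"]].append(r["download_mbps"])
    ranked = sorted(
        ((school, mean(values)) for school, values in speeds.items()),
        key=lambda item: item[1],
        reverse=True,  # fastest first
    )
    # "bottom" is reversed so the slowest school comes first.
    return {"top": ranked[:top_n], "bottom": ranked[-top_n:][::-1]}


records = [
    {"school": "A", "download_mbps": 10.0},
    {"school": "A", "download_mbps": 20.0},
    {"school": "B", "download_mbps": 5.0},
    {"school": "C", "download_mbps": 30.0},
]
comparison = compare_schools(records, top_n=2)
```

When there are fewer than `2 * top_n` schools, the top and bottom lists overlap, as school "A" does in this example.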
GigaSchoolProfileFetcher
¶
Fetch and process school profile data from the Giga School Profile API. This includes connectivity information and other school details.
Source code in gigaspatial/handlers/giga.py
fetch_profiles(**kwargs)
¶
Fetch and process school profiles including connectivity information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | | Additional parameters for customization: page_size (override the default page size); sleep_time (override the default sleep time between requests); max_pages (limit the number of pages to fetch); giga_id_school (override the default giga_id_school filter) | {} |
Returns:
Type | Description |
---|---|
DataFrame | School profiles with connectivity and geospatial info. |
Source code in gigaspatial/handlers/giga.py
get_connectivity_summary(df)
¶
Generate a summary of connectivity statistics from the fetched data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame with school profile data | required |
Returns:
Name | Type | Description |
---|---|---|
dict | dict | Summary statistics about connectivity |
Source code in gigaspatial/handlers/giga.py
google_open_buildings
¶
GoogleOpenBuildingsConfig
dataclass
¶
Bases: BaseHandlerConfig
Configuration for Google Open Buildings dataset files. Implements the BaseHandlerConfig interface for data unit resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_path(unit, data_type='polygons', **kwargs)
¶
Given a tile row or tile_id, return the corresponding file path.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_paths(units, data_type='polygons', **kwargs)
¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/google_open_buildings.py
get_relevant_data_units_by_geometry(geometry, **kwargs)
¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/google_open_buildings.py
get_relevant_data_units_by_points(points, **kwargs)
¶
Return intersecting tiles for a list of points.
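The two lookup methods above both reduce to a spatial intersection test between tiles and the query. As a simplified sketch, the example below treats each tile as an axis-aligned rectangle; the real dataset is tiled by S2 cells with polygon geometries, and the `Tile` type and helper names here are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Tile(NamedTuple):
    tile_id: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def tiles_for_bbox(
    tiles: List[Tile], bbox: Tuple[float, float, float, float]
) -> List[Tile]:
    """Return tiles overlapping bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [
        t
        for t in tiles
        if t.min_lon <= max_lon
        and t.max_lon >= min_lon
        and t.min_lat <= max_lat
        and t.max_lat >= min_lat
    ]


def tiles_for_points(
    tiles: List[Tile], points: List[Tuple[float, float]]
) -> List[Tile]:
    """Return each tile that contains at least one (lat, lon) point."""
    return [
        t
        for t in tiles
        if any(
            t.min_lat <= lat <= t.max_lat and t.min_lon <= lon <= t.max_lon
            for lat, lon in points
        )
    ]


grid = [
    Tile("west", 0.0, 0.0, 10.0, 10.0),
    Tile("east", 10.0, 0.0, 20.0, 10.0),
    Tile("north", 0.0, 10.0, 10.0, 20.0),
]
by_bbox = tiles_for_bbox(grid, (2.0, 2.0, 12.0, 8.0))
by_points = tiles_for_points(grid, [(15.0, 5.0), (5.0, 15.0)])
```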
GoogleOpenBuildingsDownloader
¶
Bases: BaseHandlerDownloader
A class to handle downloads of Google's Open Buildings dataset.
Source code in gigaspatial/handlers/google_open_buildings.py
__init__(config=None, data_store=None, logger=None)
¶
Initialize the downloader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Optional[GoogleOpenBuildingsConfig] | Optional configuration for file paths and download settings. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/google_open_buildings.py
download(source, data_type='polygons', **kwargs)
¶
Download Google Open Buildings data for a specified geographic region.
The region can be defined by a country code/name, a list of points, a Shapely geometry, or a GeoDataFrame. This method identifies the relevant S2 tiles intersecting the region and downloads the specified type of data (polygons or points) for those tiles in parallel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame] | Defines the geographic area for which to download data. Can be: a string representing a country code or name; a list of (latitude, longitude) tuples or Shapely Point objects; a Shapely BaseGeometry object (e.g., Polygon, MultiPolygon); or a GeoDataFrame with its geometry column in EPSG:4326. | required |
data_type | Literal['polygons', 'points'] | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
**kwargs | Additional keyword arguments that are passed to | {} |
Returns:
Type | Description |
---|---|
List[str] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the region or if all downloads fail. |
Source code in gigaspatial/handlers/google_open_buildings.py
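The description above says tiles are downloaded in parallel and that only successful downloads are returned. A minimal sketch of that behaviour follows, assuming a thread pool and a hypothetical `download_one` callable; the library's own concurrency mechanism and file naming may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Optional


def download_tiles(
    tile_ids: List[str],
    download_one: Callable[[str], Optional[str]],
    max_workers: int = 4,
) -> List[str]:
    """Download tiles concurrently; return the paths of those that succeeded."""
    paths: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, tid): tid for tid in tile_ids}
        for future in as_completed(futures):
            try:
                path = future.result()
            except Exception as exc:  # one failed tile must not abort the batch
                print(f"Tile {futures[future]} failed: {exc}")
                continue
            if path is not None:
                paths.append(path)
    return paths


def fake_download(tile_id: str) -> Optional[str]:
    # Hypothetical stand-in for the real per-tile download.
    if tile_id == "bad":
        raise IOError("simulated network error")
    return f"/tmp/open_buildings/{tile_id}.csv.gz"


downloaded = download_tiles(["a1", "bad", "b2"], fake_download)
```

Because results are collected with `as_completed`, the order of the returned paths is not guaranteed to match the order of `tile_ids`.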
download_by_country(country, data_type='polygons', data_store=None, country_geom_path=None)
¶
Download Google Open Buildings data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_type | Literal['polygons', 'points'] | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
Type | Description |
---|---|
List[str] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_unit(tile_info, data_type='polygons')
¶
Download data file for a single tile.
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_units(tiles, data_type='polygons')
¶
Download data files for multiple tiles.
Source code in gigaspatial/handlers/google_open_buildings.py
GoogleOpenBuildingsHandler
¶
Bases: BaseHandler
Handler for Google Open Buildings dataset.
This class provides a unified interface for downloading and loading Google Open Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/google_open_buildings.py
create_config(data_store, logger, **kwargs)
¶
Create and return a GoogleOpenBuildingsConfig instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
Type | Description |
---|---|
GoogleOpenBuildingsConfig | Configured GoogleOpenBuildingsConfig instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_downloader(config, data_store, logger, **kwargs)
¶
Create and return a GoogleOpenBuildingsDownloader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
Type | Description |
---|---|
GoogleOpenBuildingsDownloader | Configured GoogleOpenBuildingsDownloader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_reader(config, data_store, logger, **kwargs)
¶
Create and return a GoogleOpenBuildingsReader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
Type | Description |
---|---|
GoogleOpenBuildingsReader | Configured GoogleOpenBuildingsReader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, ensure_available=True, **kwargs)
¶
Load point data from Google Open Buildings dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
Type | Description |
---|---|
GeoDataFrame | GeoDataFrame containing building point data |
Source code in gigaspatial/handlers/google_open_buildings.py
load_polygons(source, ensure_available=True, **kwargs)
¶
Load polygon data from Google Open Buildings dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
Type | Description |
---|---|
GeoDataFrame | GeoDataFrame containing building polygon data |
Source code in gigaspatial/handlers/google_open_buildings.py
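The handler's `create_config`, `create_downloader`, and `create_reader` methods follow a factory-method pattern, and `ensure_available` triggers a download before loading. The toy classes below sketch that pattern under simplifying assumptions. `SketchHandler` and `DemoHandler` are hypothetical and are not the gigaspatial base classes, and the components are plain callables and dictionaries rather than real config, downloader, and reader objects.

```python
from abc import ABC, abstractmethod
from typing import List


class SketchHandler(ABC):
    """Toy stand-in for the handler pattern; not the gigaspatial base class."""

    def __init__(self, **kwargs):
        # Components are built in dependency order: config first, then its users.
        self.config = self.create_config(**kwargs)
        self.downloader = self.create_downloader(self.config)
        self.reader = self.create_reader(self.config)

    @abstractmethod
    def create_config(self, **kwargs): ...

    @abstractmethod
    def create_downloader(self, config): ...

    @abstractmethod
    def create_reader(self, config): ...

    def load(self, source: str, ensure_available: bool = True) -> List[str]:
        if ensure_available:
            self.downloader(source)  # fetch anything that is missing
        return self.reader(source)


class DemoHandler(SketchHandler):
    def create_config(self, **kwargs):
        return {"base_path": kwargs.get("base_path", "data")}

    def create_downloader(self, config):
        self.downloaded: List[str] = []
        return lambda source: self.downloaded.append(f"{config['base_path']}/{source}")

    def create_reader(self, config):
        return lambda source: [f"read:{config['base_path']}/{source}"]


handler = DemoHandler(base_path="cache")
result = handler.load("KEN")
```

Keeping construction in overridable factory methods lets each dataset swap in its own component types while the load flow stays in one place.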
GoogleOpenBuildingsReader
¶
Bases: BaseHandlerReader
Reader for Google Open Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
load_from_paths(source_data_path, **kwargs)
¶
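Load building data from the Google Open Buildings dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_data_path | | List of file paths to load | required |
Returns:
Type | Description |
---|---|
GeoDataFrame | GeoDataFrame containing building data |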
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, **kwargs)
¶
load_polygons(source, **kwargs)
¶
hdx
¶
HDXConfig
dataclass
¶
Bases: BaseHandlerConfig
Configuration for HDX data access
Source code in gigaspatial/handlers/hdx.py
output_dir_path: Path
property
¶
Path to save the downloaded HDX dataset
configure_hdx()
¶
Configure HDX API if not already configured
Source code in gigaspatial/handlers/hdx.py
fetch_dataset()
¶
Get the HDX dataset
Source code in gigaspatial/handlers/hdx.py
get_data_unit_path(unit, **kwargs)
¶
Get the path for a data unit
get_dataset_resources(filter=None, exact_match=False)
¶
Get resources from the HDX dataset
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter | Optional[Dict[str, Any]] | Dictionary of key-value pairs to filter resources | None |
exact_match | bool | If True, perform exact matching. If False, use pattern matching | False |
Source code in gigaspatial/handlers/hdx.py
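The difference between `exact_match=True` and `exact_match=False` can be sketched as follows. This is an illustration only: it assumes that "pattern matching" means a case-insensitive regular-expression search, which may not be how the library interprets patterns, and it operates on plain dictionaries rather than HDX Resource objects.

```python
import re
from typing import Any, Dict, List, Optional


def filter_resources(
    resources: List[Dict[str, Any]],
    filter: Optional[Dict[str, Any]] = None,
    exact_match: bool = False,
) -> List[Dict[str, Any]]:
    """Keep resources whose fields satisfy every key-value pair in `filter`."""
    if not filter:
        return list(resources)

    def matches(resource: Dict[str, Any]) -> bool:
        for key, wanted in filter.items():
            actual = str(resource.get(key, ""))
            if exact_match:
                if actual != str(wanted):
                    return False
            # Assumption: a pattern is a case-insensitive regex search.
            elif not re.search(str(wanted), actual, flags=re.IGNORECASE):
                return False
        return True

    return [r for r in resources if matches(r)]


resources = [
    {"name": "ken_admin1.geojson", "format": "GeoJSON"},
    {"name": "ken_admin2.geojson", "format": "GeoJSON"},
    {"name": "ken_population.csv", "format": "CSV"},
]
admin_layers = filter_resources(resources, {"name": r"admin\d"})
exact_csv = filter_resources(resources, {"format": "CSV"}, exact_match=True)
no_exact = filter_resources(resources, {"name": "admin"}, exact_match=True)
```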
get_relevant_data_units(source, **kwargs)
¶
Get relevant data units based on the source type
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, Dict] | Either a country name/code (str) or a filter dictionary | required |
**kwargs | | Additional keyword arguments passed to the specific method | {} |
Returns:
Type | Description |
---|---|
List[Resource] | List of matching resources |
Source code in gigaspatial/handlers/hdx.py
get_relevant_data_units_by_country(country, key='url', **kwargs)
¶
Get relevant data units for a country
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country | str | Country name or code | required |
key | str | The key to filter on in the resource data | 'url' |
patterns | | List of patterns to match against the resource data | required |
**kwargs | | Additional keyword arguments | {} |
Source code in gigaspatial/handlers/hdx.py
list_resources()
¶
List all resources in the dataset directory using the data_store.
Source code in gigaspatial/handlers/hdx.py
search_datasets(query, rows=None, sort='relevance asc, metadata_modified desc', hdx_site='prod', user_agent='gigaspatial')
staticmethod
¶
Search for datasets in HDX before initializing the class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | Search query string | required |
rows | int | Number of results per page. Defaults to all datasets (sys.maxsize). | None |
sort | str | Sort order, e.g. 'relevance', 'views_recent', 'views_total', or 'last_modified' (default: 'relevance asc, metadata_modified desc') | 'relevance asc, metadata_modified desc' |
hdx_site | str | HDX site to use - 'prod' or 'test' (default: 'prod') | 'prod' |
user_agent | str | User agent for HDX API requests (default: 'gigaspatial') | 'gigaspatial' |
Returns:
Type | Description |
---|---|
List[Dict] | List of dataset dictionaries containing search results |
Example
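    from gigaspatial.handlers.hdx import HDXConfig

    # Search HDX for datasets matching a query and print the first five hits.
    results = HDXConfig.search_datasets("population", rows=5)
    for dataset in results:
        print(f"Name: {dataset['name']}, Title: {dataset['title']}")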
Source code in gigaspatial/handlers/hdx.py
HDXDownloader
¶
Bases: BaseHandlerDownloader
Downloader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
download(source, **kwargs)
¶
download_data_unit(resource, **kwargs)
¶
Download a single resource
Source code in gigaspatial/handlers/hdx.py
download_data_units(resources, **kwargs)
¶
Download multiple resources sequentially
Parameters:
Name | Type | Description | Default |
---|---|---|---|
resources | List[Resource] | List of HDX Resource objects | required |
**kwargs | | Additional keyword arguments | {} |
Returns:
Type | Description |
---|---|
List[str] | List of paths to downloaded files |
Source code in gigaspatial/handlers/hdx.py
HDXHandler
¶
Bases: BaseHandler
Handler for HDX datasets
Source code in gigaspatial/handlers/hdx.py
create_config(data_store, logger, **kwargs)
¶
Create and return a HDXConfig instance
Source code in gigaspatial/handlers/hdx.py
create_downloader(config, data_store, logger, **kwargs)
¶
Create and return a HDXDownloader instance
Source code in gigaspatial/handlers/hdx.py
create_reader(config, data_store, logger, **kwargs)
¶
Create and return a HDXReader instance
Source code in gigaspatial/handlers/hdx.py
HDXReader
¶
Bases: BaseHandlerReader
Reader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
load_from_paths(source_data_path, **kwargs)
¶
Load data from paths
Source code in gigaspatial/handlers/hdx.py
mapbox_image
¶
MapboxImageDownloader
¶
Class to download images from Mapbox Static Images API using a specific style
Source code in gigaspatial/handlers/mapbox_image.py
__init__(access_token=config.MAPBOX_ACCESS_TOKEN, style_id=None, data_store=None)
¶
Initialize the downloader with Mapbox credentials
Parameters:
Name | Type | Description | Default |
---|---|---|---|
access_token | str | Mapbox access token | MAPBOX_ACCESS_TOKEN |
style_id | Optional[str] | Mapbox style ID to use for image download | None |
data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_')
¶
Download images for the given bounding boxes using the specified style
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | | GeoDataFrame containing bounding box polygons | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_')
¶
Download images for the given coordinates by creating bounding boxes around points
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Union[DataFrame, List[Tuple[float, float]]] | A DataFrame with either latitude/longitude columns or a geometry column, or a list of (lat, lon) tuples | required |
res_meters_pixel | float | Size of the bounding box in meters (creates a square) | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
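Creating a bounding box around a point can be sketched with simple spherical arithmetic. The example assumes that the box side equals `res_meters_pixel` multiplied by the pixel count, and that one degree of latitude is about 111,320 m. The parameter table above describes `res_meters_pixel` as the box size in meters, so the library's own computation may well differ; `bbox_around_point` is a hypothetical helper.

```python
import math
from typing import Tuple

METERS_PER_DEGREE_LAT = 111_320.0  # rough spherical approximation


def bbox_around_point(
    lat: float,
    lon: float,
    res_meters_pixel: float,
    image_size: Tuple[int, int] = (512, 512),
) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for an image centred on a point."""
    width_px, height_px = image_size
    half_width_m = res_meters_pixel * width_px / 2
    half_height_m = res_meters_pixel * height_px / 2
    dlat = half_height_m / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = half_width_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


bbox = bbox_around_point(lat=0.0, lon=36.0, res_meters_pixel=1.0)
```

The approximation degrades near the poles, where `cos(lat)` approaches zero; buffering in a projected coordinate system is the more robust alternative.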
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_')
¶
Download images for given mercator tiles using the specified style
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
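Since the tiles are identified by quadkeys, it can help to see how a quadkey maps to a tile and its geographic bounds. The sketch below uses the standard quadkey and Web Mercator tile formulas; the helper names are hypothetical and it does not depend on the `MercatorTiles` class.

```python
import math
from typing import Tuple


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Decode a quadkey into (x, y, zoom) tile indices."""
    x = y = 0
    zoom = len(quadkey)
    for i, digit in enumerate(quadkey):
        mask = 1 << (zoom - i - 1)
        if digit == "1":
            x |= mask
        elif digit == "2":
            y |= mask
        elif digit == "3":
            x |= mask
            y |= mask
        elif digit != "0":
            raise ValueError(f"Invalid quadkey digit: {digit!r}")
    return x, y, zoom


def tile_bounds(x: int, y: int, zoom: int) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for a Web Mercator tile."""
    n = 2 ** zoom

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Row indices grow southward, so row y is the northern edge.
    return (lon(x), lat(y + 1), lon(x + 1), lat(y))


tile = quadkey_to_tile("213")
bounds = tile_bounds(*tile)
```

Each quadkey digit encodes one zoom level, so the key length equals the zoom and a parent tile's key is a prefix of its children's keys.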
maxar_image
¶
MaxarConfig
¶
Bases: BaseModel
Configuration for Maxar Image Downloader using Pydantic
Source code in gigaspatial/handlers/maxar_image.py
wms_url: str
property
¶
Generate the full WMS URL with connection string
validate_non_empty(value, field)
classmethod
¶
Ensure required credentials are provided
Source code in gigaspatial/handlers/maxar_image.py
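The combination of a non-empty validator and a derived URL property can be illustrated with a standard-library stand-in. This sketch uses a dataclass instead of Pydantic so that it runs without extra dependencies, and every name in it (`SketchWMSConfig`, `base_url`, `connection_string`, the `connectid` query parameter) is hypothetical rather than taken from `MaxarConfig`.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class SketchWMSConfig:
    """Stdlib stand-in for a settings model; all field names are hypothetical."""

    base_url: str
    connection_string: str

    def __post_init__(self) -> None:
        # Mirrors the idea of a "non-empty" validator for required credentials.
        for name in ("base_url", "connection_string"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")

    @property
    def wms_url(self) -> str:
        # "connectid" is an illustrative query parameter name.
        query = urlencode({"connectid": self.connection_string})
        return f"{self.base_url}?{query}"


cfg = SketchWMSConfig(base_url="https://example.com/wms", connection_string="abc 123")
url = cfg.wms_url
```

Deriving the URL in a property keeps the credential in a single field and ensures it is always URL-encoded consistently.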
MaxarImageDownloader
¶
Class to download images from Maxar
Source code in gigaspatial/handlers/maxar_image.py
__init__(config=None, data_store=None)
¶
Initialize the downloader with Maxar config.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Optional[MaxarConfig] | MaxarConfig instance containing credentials and settings | None |
data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), image_prefix='maxar_image_')
¶
Download images for the given bounding boxes
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | | GeoDataFrame containing bounding box polygons | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), image_prefix='maxar_image_')
¶
Download images for the given coordinates by creating bounding boxes around points
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Union[DataFrame, List[Tuple[float, float]]] | A DataFrame with either latitude/longitude columns or a geometry column, or a list of (lat, lon) tuples | required |
res_meters_pixel | float | Resolution in meters per pixel | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), image_prefix='maxar_image_')
¶
Download images for given mercator tiles using the specified style
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
microsoft_global_buildings
¶
MSBuildingsConfig
dataclass
¶
Bases: BaseHandlerConfig
Configuration for Microsoft Global Buildings dataset files.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__post_init__()
¶
Initialize the configuration, load tile URLs, and set up location mapping.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_location_mapping(similarity_score_threshold=0.8)
¶
Create a mapping between the dataset's location names and ISO 3166-1 alpha-3 country codes.
This function iterates through known countries and attempts to find matching locations in the dataset based on string similarity.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
similarity_score_threshold | float | The minimum similarity score (between 0 and 1) for a dataset location to be considered a match for a country. Defaults to 0.8. | 0.8 |
Returns:
Type | Description |
---|---|
dict | A dictionary where keys are dataset location names and values are the corresponding ISO 3166-1 alpha-3 country codes. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
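The threshold-based matching described for create_location_mapping can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the page does not say which similarity measure is used, so `difflib.SequenceMatcher` stands in for it, and `build_location_mapping`, the sample country table, and the sample location names are all hypothetical.

```python
from difflib import SequenceMatcher


def build_location_mapping(dataset_locations, countries, threshold=0.8):
    """Map dataset location names to ISO3 codes using best string similarity.

    countries: dict of ISO3 code -> country name (hypothetical input shape).
    """
    mapping = {}
    for iso3, country_name in countries.items():
        best_location, best_score = None, 0.0
        for location in dataset_locations:
            # Similarity in [0, 1]; difflib is an assumed stand-in measure
            score = SequenceMatcher(
                None, country_name.lower(), location.lower()
            ).ratio()
            if score > best_score:
                best_location, best_score = location, score
        # Keep only matches that reach the similarity threshold
        if best_location is not None and best_score >= threshold:
            mapping[best_location] = iso3
    return mapping


countries = {"KEN": "Kenya", "BRA": "Brazil", "USA": "United States"}
dataset_locations = ["Kenya", "Brazil", "UnitedStates", "Atlantis"]
mapping = build_location_mapping(dataset_locations, countries)
```

Raising the threshold makes the mapping stricter: names that differ from the country name by more than small spelling or spacing variations are left unmapped.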
get_relevant_data_units_by_country(country, **kwargs)
¶
Return intersecting tiles for a given country.
get_relevant_data_units_by_geometry(geometry, **kwargs)
¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
get_relevant_data_units_by_points(points, **kwargs)
¶
Return intersecting tiles for a list of points.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
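This page does not state how tiles are indexed. Assuming they are keyed by Web Mercator quadkeys (quadkeys are mentioned for MercatorTiles elsewhere on this page), resolving tiles for a list of points might look roughly like the sketch below. The function names and the zoom level are hypothetical; the quadkey arithmetic follows the widely used Bing Maps tile scheme.

```python
import math


def latlon_to_tile(lat, lon, zoom):
    """Convert a (latitude, longitude) pair to Web Mercator tile x/y indices."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid tile range to handle points on the map edge
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_quadkey(x, y, zoom):
    """Interleave the bits of x and y into a base-4 quadkey string."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)


def quadkeys_for_points(points, zoom):
    """Return the set of quadkeys covering (latitude, longitude) points."""
    return {
        tile_to_quadkey(*latlon_to_tile(lat, lon, zoom), zoom)
        for lat, lon in points
    }
```

Collecting the quadkeys in a set means that many points falling in the same tile resolve to a single data unit.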
MSBuildingsDownloader
¶
Bases: BaseHandlerDownloader
A class to handle downloads of Microsoft's Global ML Building Footprints dataset.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__init__(config=None, data_store=None, logger=None)
¶
Initialize the downloader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Optional[MSBuildingsConfig] | Optional configuration for customizing download behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download(source, **kwargs)
¶
Download Microsoft Global ML Building Footprints data for a specified geographic region.
The region can be defined by a country, a list of points, a Shapely geometry, or a GeoDataFrame. This method identifies the relevant data tiles intersecting the region and downloads them in parallel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame] | Defines the geographic area for which to download data. Can be: - A string representing a country code or name. - A list of (latitude, longitude) tuples or Shapely Point objects. - A Shapely BaseGeometry object (e.g., Polygon, MultiPolygon). - A GeoDataFrame with a geometry column in EPSG:4326. | required |
**kwargs | Additional parameters passed to data unit resolution methods | {} |
Returns:
Type | Description |
---|---|
List[str] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the region or if all downloads fail. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
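The "download in parallel, return only the successful paths" behaviour described for download can be sketched with the standard library. This is an illustration under assumptions, not the downloader's actual code: `download_tiles` and `fake_download` are hypothetical, and a thread pool is just one plausible way to parallelise the work.

```python
from concurrent.futures import ThreadPoolExecutor


def download_tiles(tiles, download_one, max_workers=4):
    """Run download_one over all tiles in parallel; keep successful paths only."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of tiles
        results = list(pool.map(download_one, tiles))
    # A failed download is represented here by None (an assumption)
    return [path for path in results if path is not None]


def fake_download(tile):
    """Hypothetical stand-in for a real download; 'bad' simulates a failure."""
    if tile == "bad":
        return None
    return f"/tmp/buildings/{tile}.csv.gz"


paths = download_tiles(["0231", "bad", "0233"], fake_download)
```

If every tile fails, the list comprehension yields an empty list, which matches the documented return value for the all-failed case.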
download_by_country(country, data_store=None, country_geom_path=None)
¶
Download Microsoft Global ML Building Footprints data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
Type | Description |
---|---|
List[str] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the country or if all downloads fail. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_unit(tile_info, **kwargs)
¶
Download data file for a single tile.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_units(tiles, **kwargs)
¶
Download data files for multiple tiles.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsHandler
¶
Bases: BaseHandler
Handler for Microsoft Global Buildings dataset.
This class provides a unified interface for downloading and loading Microsoft Global Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_config(data_store, logger, **kwargs)
¶
Create and return a MSBuildingsConfig instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional configuration parameters | {} |
Returns:
Type | Description |
---|---|
MSBuildingsConfig | Configured MSBuildingsConfig instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_downloader(config, data_store, logger, **kwargs)
¶
Create and return a MSBuildingsDownloader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | MSBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional downloader parameters | {} |
Returns:
Type | Description |
---|---|
MSBuildingsDownloader | Configured MSBuildingsDownloader instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_reader(config, data_store, logger, **kwargs)
¶
Create and return a MSBuildingsReader instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | MSBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional reader parameters | {} |
Returns:
Type | Description |
---|---|
MSBuildingsReader | Configured MSBuildingsReader instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
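The three create_* methods above follow a factory-method pattern: the base handler decides when components are built, and each subclass decides what they are. The sketch below shows the pattern in isolation. All class names and the placeholder return values are hypothetical, and the real BaseHandler may wire its components differently (for example lazily, through properties).

```python
from abc import ABC, abstractmethod


class SketchHandler(ABC):
    """Hypothetical base class that builds its components via factory methods."""

    def __init__(self, data_store=None, logger=None, **kwargs):
        self.config = self.create_config(data_store, logger, **kwargs)
        self.downloader = self.create_downloader(self.config, data_store, logger)
        self.reader = self.create_reader(self.config, data_store, logger)

    @abstractmethod
    def create_config(self, data_store, logger, **kwargs): ...

    @abstractmethod
    def create_downloader(self, config, data_store, logger, **kwargs): ...

    @abstractmethod
    def create_reader(self, config, data_store, logger, **kwargs): ...


class SketchBuildingsHandler(SketchHandler):
    """Concrete handler; plain dicts and tuples stand in for real components."""

    def create_config(self, data_store, logger, **kwargs):
        return {"dataset": "buildings", **kwargs}

    def create_downloader(self, config, data_store, logger, **kwargs):
        return ("downloader", config["dataset"])

    def create_reader(self, config, data_store, logger, **kwargs):
        return ("reader", config["dataset"])


handler = SketchBuildingsHandler(base_path="/data")
```

Because the downloader and reader receive the same config object, both components stay consistent with whatever configuration the handler was created with.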
MSBuildingsReader
¶
Bases: BaseHandlerReader
Reader for Microsoft Global Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
load_from_paths(source_data_path, **kwargs)
¶
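Load building data from the Microsoft Buildings dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_data_path | List of file paths to load | required |
Returns:
Type | Description |
---|---|
GeoDataFrame | GeoDataFrame containing building data |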
Source code in gigaspatial/handlers/microsoft_global_buildings.py
opencellid
¶
OpenCellIDConfig
¶
Bases: BaseModel
Configuration for OpenCellID data access
Source code in gigaspatial/handlers/opencellid.py
output_file_path: Path
property
¶
Path to save the downloaded OpenCellID data
OpenCellIDDownloader
¶
Downloader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
download_and_process()
¶
Download and process OpenCellID data for the configured country
Source code in gigaspatial/handlers/opencellid.py
from_country(country, api_token=global_config.OPENCELLID_ACCESS_TOKEN, **kwargs)
classmethod
¶
Create a downloader for a specific country
Source code in gigaspatial/handlers/opencellid.py
get_download_links()
¶
Get download links for the country from OpenCellID website
Source code in gigaspatial/handlers/opencellid.py
OpenCellIDReader
¶
Reader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
read_data()
¶
Read OpenCellID data for the specified country
Source code in gigaspatial/handlers/opencellid.py
to_geodataframe()
¶
Convert OpenCellID data to a GeoDataFrame
osm
¶
OSMLocationFetcher
dataclass
¶
A class to fetch and process location data from OpenStreetMap using the Overpass API.
This class supports fetching various OSM location types including amenities, buildings, shops, and other POI categories.
Source code in gigaspatial/handlers/osm.py
__post_init__()
¶
Validate inputs, normalize location_types, and set up logging.
Source code in gigaspatial/handlers/osm.py
fetch_locations(since_year=None, handle_duplicates='separate')
¶
Fetch and process OSM locations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
since_year | int | Filter for locations added/modified since this year. | None |
handle_duplicates | str | How to handle objects matching multiple categories: - 'separate': Create separate entries for each category (default) - 'combine': Use a single entry with a list of matching categories - 'primary': Keep only the first matching category | 'separate' |
Returns:
Type | Description |
---|---|
DataFrame | Processed OSM locations |
Source code in gigaspatial/handlers/osm.py
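The three handle_duplicates modes can be made concrete with a small sketch. It is a stand-alone illustration under assumptions: `resolve_duplicates`, the record shape, and the output keys are hypothetical and do not reflect the fetcher's internal data structures.

```python
def resolve_duplicates(records, mode="separate"):
    """Expand or collapse objects that match several categories.

    records: list of (osm_id, [categories]) pairs (hypothetical input shape).
    """
    rows = []
    for osm_id, categories in records:
        if mode == "separate":
            # One row per matching category
            rows.extend({"id": osm_id, "category": c} for c in categories)
        elif mode == "combine":
            # One row holding the full list of matching categories
            rows.append({"id": osm_id, "category": list(categories)})
        elif mode == "primary":
            # One row holding only the first matching category
            rows.append({"id": osm_id, "category": categories[0]})
        else:
            raise ValueError(f"Unknown handle_duplicates mode: {mode!r}")
    return rows


records = [(1, ["school", "library"]), (2, ["hospital"])]
separate = resolve_duplicates(records, "separate")
combined = resolve_duplicates(records, "combine")
primary = resolve_duplicates(records, "primary")
```

With 'separate' the row count can exceed the number of OSM objects, which matters when counting locations; 'combine' and 'primary' keep one row per object.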
overture
¶
OvertureAmenityFetcher
¶
A class to fetch and process amenity locations from Overture.
Source code in gigaspatial/handlers/overture.py
__post_init__()
¶
Validate inputs and set up logging.
Source code in gigaspatial/handlers/overture.py
fetch_locations(match_pattern=False, **kwargs)
¶
Fetch and process amenity locations.
Source code in gigaspatial/handlers/overture.py
rwi
¶
RWIConfig
dataclass
¶
Bases: HDXConfig
Configuration for Relative Wealth Index data access
Source code in gigaspatial/handlers/rwi.py
get_relevant_data_units_by_country(country, **kwargs)
¶
Get relevant data units for a country, optionally filtering for latest version
Source code in gigaspatial/handlers/rwi.py
RWIDownloader
¶
Bases: HDXDownloader
Specialized downloader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
RWIHandler
¶
Bases: HDXHandler
Handler for Relative Wealth Index dataset
Source code in gigaspatial/handlers/rwi.py
create_config(data_store, logger, **kwargs)
¶
Create and return a RWIConfig instance
create_downloader(config, data_store, logger, **kwargs)
¶
Create and return a RWIDownloader instance
Source code in gigaspatial/handlers/rwi.py
create_reader(config, data_store, logger, **kwargs)
¶
Create and return a RWIReader instance
Source code in gigaspatial/handlers/rwi.py
RWIReader
¶
Bases: HDXReader
Specialized reader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
unicef_georepo
¶
GeoRepoClient
¶
A client for interacting with the GeoRepo API.
GeoRepo is a platform for managing and accessing geospatial administrative boundary data. This client provides methods to search, retrieve, and work with modules, datasets, views, and administrative entities.
Attributes:
Name | Type | Description |
---|---|---|
base_url | str | The base URL for the GeoRepo API |
api_key | str | The API key for authentication |
email | str | The email address associated with the API key |
headers | dict | HTTP headers used for API requests |
Source code in gigaspatial/handlers/unicef_georepo.py
__init__(api_key=None, email=None)
¶
Initialize the GeoRepo client.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
api_key | str | GeoRepo API key. If not provided, will use the GEOREPO_API_KEY environment variable from config. | None |
email | str | Email address associated with the API key. If not provided, will use the GEOREPO_USER_EMAIL environment variable from config. | None |
Raises:
Type | Description |
---|---|
ValueError | If api_key or email is not provided and cannot be found in environment variables. |
Source code in gigaspatial/handlers/unicef_georepo.py
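The fallback from explicit arguments to environment variables, and the ValueError when neither is available, can be sketched as follows. The environment variable names come from the parameter table above; `resolve_credentials` and its `env` argument are hypothetical, and the real client reads these values through the package configuration rather than directly.

```python
import os


def resolve_credentials(api_key=None, email=None, env=None):
    """Prefer explicit arguments, then fall back to environment variables."""
    env = os.environ if env is None else env
    api_key = api_key or env.get("GEOREPO_API_KEY")
    email = email or env.get("GEOREPO_USER_EMAIL")
    if not api_key or not email:
        raise ValueError(
            "GeoRepo API key and email are required: pass them explicitly or "
            "set GEOREPO_API_KEY and GEOREPO_USER_EMAIL."
        )
    return api_key, email
```

Passing `env` as a plain dict makes the fallback easy to exercise without touching the real process environment.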
check_connection()
¶
Checks if the API connection is valid by making a simple request.
Returns:
Type | Description |
---|---|
bool | True if the connection is valid, False otherwise. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_country_by_iso3(view_uuid, iso3_code)
¶
Find a country entity using its ISO3 country code.
This method searches through all level-0 (country) entities to find one that matches the provided ISO3 code. It checks both the entity's Ucode and any external codes stored in the ext_codes field.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
iso3_code | str | The ISO3 country code to search for (e.g., 'USA', 'KEN', 'BRA'). | required |
Returns:
Type | Description |
---|---|
dict or None | Entity information dictionary for the matching country if found, including properties like name, ucode, admin_level, etc. Returns None if no matching country is found. |
Note
This method handles pagination automatically to search through all available countries in the dataset, which may involve multiple API calls.
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or view_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
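The automatic pagination mentioned in the note can be sketched as a loop that requests level-0 pages until a match is found or the last page is reached. Everything here is an assumption for illustration: `find_by_iso3`, the `fetch_page` callable, the 'results' key, the prefix test on the Ucode, and the sample records are hypothetical, and the real response layout may differ.

```python
def find_by_iso3(fetch_page, iso3_code):
    """Scan paginated country entities for a matching ISO3 code.

    fetch_page(page) must return (data, meta), where meta carries 'total_page'.
    """
    page = 1
    while True:
        data, meta = fetch_page(page)
        for entity in data.get("results", []):
            ext_codes = (entity.get("ext_codes") or {}).values()
            # Check the Ucode first, then any external codes
            if entity.get("ucode", "").startswith(iso3_code) or iso3_code in ext_codes:
                return entity
        if page >= meta.get("total_page", 1):
            return None  # every page searched, no match
        page += 1


# Hypothetical two-page response used to exercise the loop
PAGES = {
    1: [{"name": "Brazil", "ucode": "BRA_V1", "ext_codes": {}}],
    2: [{"name": "Kenya", "ucode": "XX_0001", "ext_codes": {"default": "KEN"}}],
}


def fake_fetch(page):
    return {"results": PAGES[page]}, {"page": page, "total_page": 2}
```

Stopping as soon as a match is found keeps the number of API calls low for countries that appear on an early page.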
get_admin_boundaries(view_uuid, admin_level=None, geom='full_geom', format='geojson')
¶
Get administrative boundaries for a specific level or all levels.
This is a convenience method that can retrieve boundaries for a single administrative level or attempt to fetch all available levels.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | Administrative level to retrieve (0=country, 1=region, etc.). If None, attempts to fetch all levels. | None |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
Type | Description |
---|---|
dict | JSON/GeoJSON response containing administrative boundaries in the specified format. For GeoJSON, returns a FeatureCollection. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_dataset_details(dataset_uuid)
¶
Get detailed information about a specific dataset.
This includes metadata about the dataset and information about available administrative levels (e.g., country, province, district).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
Returns:
Type | Description |
---|---|
dict | JSON response containing dataset details including: - Basic metadata (name, description, etc.) - Available administrative levels and their properties - Temporal information and data sources |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_entity_by_ucode(ucode, geom='full_geom', format='geojson')
¶
Get detailed information about a specific entity using its Ucode.
A Ucode (Universal Code) is a unique identifier for geographic entities within the GeoRepo system, typically in the format "ISO3_LEVEL_NAME".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ucode | str | The unique code identifier for the entity. | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
Type | Description |
---|---|
dict | JSON/GeoJSON response containing entity details including geometry, properties, administrative level, and metadata. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or ucode is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_vector_tiles_url(view_info)
¶
Generate an authenticated URL for accessing vector tiles.
Vector tiles are used for efficient map rendering and can be consumed by mapping libraries like Mapbox GL JS or OpenLayers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_info | dict | Dictionary containing view information that must include a 'vector_tiles' key with the base vector tiles URL. | required |
Returns:
Type | Description |
---|---|
str | Fully authenticated vector tiles URL with API key and user email parameters appended for access control. |
Raises:
Type | Description |
---|---|
ValueError | If 'vector_tiles' key is not found in view_info. |
Source code in gigaspatial/handlers/unicef_georepo.py
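A rough sketch of how such an authenticated URL could be assembled is shown below. The query parameter names `token` and `georepo_user_key` are placeholders chosen for illustration, since this page does not state the names GeoRepo expects, and `build_tiles_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_tiles_url(view_info, api_key, email):
    """Append credential query parameters to a view's vector tiles URL."""
    if "vector_tiles" not in view_info:
        raise ValueError("view_info does not contain a 'vector_tiles' key")
    # Parameter names below are assumptions, not confirmed API names
    query = urlencode({"token": api_key, "georepo_user_key": email})
    return f"{view_info['vector_tiles']}?{query}"


url = build_tiles_url(
    {"vector_tiles": "https://example.org/tiles/{z}/{x}/{y}"},
    api_key="abc",
    email="me@example.org",
)
```

Using `urlencode` percent-encodes characters such as "@" in the email, so the resulting URL is safe to hand to a mapping library as a tile template.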
list_datasets_by_module(module_uuid)
¶
List all datasets within a specific module.
A dataset represents a collection of related geographic entities, such as administrative boundaries for a specific country or region.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module_uuid | str | The UUID of the module to query. | required |
Returns:
Type | Description |
---|---|
dict | JSON response containing a list of datasets with their metadata. Each dataset includes 'uuid', 'name', 'description', creation date, etc. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or module_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entities_by_admin_level(view_uuid, admin_level, geom='no_geom', format='json', page=1, page_size=50)
¶
List entities at a specific administrative level within a view.
Administrative levels typically follow a hierarchy:
- Level 0: Countries
- Level 1: States/Provinces/Regions
- Level 2: Districts/Counties
- Level 3: Sub-districts/Municipalities
- And so on...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | The administrative level to retrieve (0, 1, 2, etc.). | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
Type | Description |
---|---|
tuple | A tuple containing: - dict: JSON/GeoJSON response with entity data - dict: Metadata with pagination info (page, total_page, total_count) |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entity_children(view_uuid, entity_ucode, geom='no_geom', format='json')
¶
List direct children of an entity in the administrative hierarchy.
For example, if given a country entity, this will return its states/provinces. If given a state entity, this will return its districts/counties.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_uuid | str | The UUID of the view containing the entity. | required |
entity_ucode | str | The Ucode of the parent entity. | required |
geom | str | Geometry inclusion level. Options: - "no_geom": No geometry data - "centroid": Only centroid points - "full_geom": Complete boundary geometries Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
Returns:
Type | Description |
---|---|
dict | JSON/GeoJSON response containing list of child entities with their properties and optional geometry data. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_modules()
¶
List all available modules in GeoRepo.
A module is a top-level organizational unit that contains datasets. Examples include "Admin Boundaries", "Health Facilities", etc.
Returns:
Type | Description |
---|---|
dict | JSON response containing a list of modules with their metadata. Each module includes 'uuid', 'name', 'description', and other properties. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_views_by_dataset(dataset_uuid, page=1, page_size=50)
¶
List views for a dataset with pagination support.
A view represents a specific version or subset of a dataset. Views may be tagged as 'latest' or represent different time periods.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
Type | Description |
---|---|
dict | JSON response containing paginated list of views with metadata. Includes 'results', 'total_page', 'current_page', and 'count' fields. Each view includes 'uuid', 'name', 'tags', and other properties. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
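Since views may be tagged as 'latest', a common follow-up is to pick that view out of the paginated response. The sketch below relies only on the 'results', 'tags', 'uuid', and 'name' fields named in the return description; `pick_latest_view` and the sample response are hypothetical.

```python
def pick_latest_view(response):
    """Return the first view tagged 'latest', or None if there is none."""
    for view in response.get("results", []):
        if "latest" in view.get("tags", []):
            return view
    return None


sample_response = {
    "results": [
        {"uuid": "v-2022", "name": "World 2022", "tags": ["dataset"]},
        {"uuid": "v-2023", "name": "World 2023", "tags": ["dataset", "latest"]},
    ],
    "total_page": 1,
    "current_page": 1,
    "count": 2,
}
latest = pick_latest_view(sample_response)
```

When a dataset has more views than fit on one page, the same check would need to be repeated for each page until a tagged view is found.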
search_entities_by_name(view_uuid, name, page=1, page_size=50)
¶
Search for entities by name using fuzzy matching.
This performs a similarity-based search to find entities whose names match or are similar to the provided search term.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
name | str | The name or partial name to search for. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
Type | Description |
---|---|
dict | JSON response containing paginated search results with matching entities and their similarity scores. |
Raises:
Type | Description |
---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_admin_boundaries_module()
¶
Find and return the UUID of the Admin Boundaries module.
This is a convenience function that searches through all available modules to locate the one named "Admin Boundaries", which typically contains administrative boundary datasets.
Returns:
Type | Description |
---|---|
str | The UUID of the Admin Boundaries module. |
Raises:
Type | Description |
---|---|
ValueError | If the Admin Boundaries module is not found. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_country_boundaries_by_iso3(iso3_code, client=None, admin_level=None)
¶
Get administrative boundaries for a specific country using its ISO3 code.
This function provides a high-level interface to retrieve country boundaries by automatically finding the appropriate module, dataset, and view, then fetching the requested administrative boundaries.
The function will:
1. Find the Admin Boundaries module
2. Locate a global dataset within that module
3. Find the latest view of that dataset
4. Search for the country using the ISO3 code
5. Look for a country-specific view if available
6. Retrieve boundaries at the specified admin level or all levels
Parameters:
Name | Type | Description | Default |
---|---|---|---|
iso3_code | str | The ISO3 country code (e.g., 'USA', 'KEN', 'BRA'). | required |
admin_level | int | The administrative level to retrieve: - 0: Country level - 1: State/Province/Region level - 2: District/County level - 3: Sub-district/Municipality level - etc. If None, retrieves all available administrative levels. | None |
Returns:
Type | Description |
---|---|
dict | A GeoJSON FeatureCollection containing the requested boundaries. Each feature includes geometry and properties for the administrative unit. |
Raises:
Type | Description |
---|---|
ValueError | If the Admin Boundaries module, datasets, views, or country cannot be found. |
HTTPError | If any API requests fail. |
Note
This function may make multiple API calls and can take some time for countries with many administrative units. It handles pagination automatically and attempts to use country-specific views when available for better performance.
Example
# Get all administrative levels for Kenya
boundaries = get_country_boundaries_by_iso3('KEN')
# Get only province-level boundaries for Kenya
provinces = get_country_boundaries_by_iso3('KEN', admin_level=1)
Source code in gigaspatial/handlers/unicef_georepo.py
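Because the result is a GeoJSON FeatureCollection, it can be inspected with plain dictionary access. The sketch below groups feature names by administrative level. It assumes each feature's properties include 'name' and 'admin_level', which this page mentions for entity information but does not guarantee for every response; `names_by_admin_level` and the sample data are hypothetical.

```python
from collections import defaultdict


def names_by_admin_level(feature_collection):
    """Group feature names by their 'admin_level' property."""
    grouped = defaultdict(list)
    for feature in feature_collection.get("features", []):
        props = feature.get("properties", {})
        grouped[props.get("admin_level")].append(props.get("name"))
    return dict(grouped)


# Hypothetical response with geometry omitted for brevity
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"name": "Kenya", "admin_level": 0}},
        {"type": "Feature", "geometry": None,
         "properties": {"name": "Nairobi", "admin_level": 1}},
        {"type": "Feature", "geometry": None,
         "properties": {"name": "Mombasa", "admin_level": 1}},
    ],
}
levels = names_by_admin_level(sample)
```

A quick summary like this is a cheap way to confirm which levels came back before loading the geometries into a GeoDataFrame.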
worldpop
¶
WorldPopConfig
¶
Bases: BaseModel
Source code in gigaspatial/handlers/worldpop.py
dataset_path: Path
property
¶
Construct and return the path for the configured dataset.
dataset_url: str
property
¶
Get the URL for the configured dataset. The URL is computed on first access and then cached for subsequent calls.
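The compute-once-then-cache behaviour described above can be illustrated with `functools.cached_property`. This is a minimal sketch only: `ExampleConfig`, its fields, the placeholder URL, and the `url_builds` counter are all hypothetical, and the real `WorldPopConfig` may implement its caching differently.

```python
from functools import cached_property


class ExampleConfig:
    """Hypothetical stand-in for a config class; not the WorldPopConfig implementation."""

    BASE_URL = "https://example.org/data"  # placeholder, not a real WorldPop endpoint

    def __init__(self, country: str, year: int) -> None:
        self.country = country
        self.year = year
        self.url_builds = 0  # counts how often the URL is actually computed

    @cached_property
    def dataset_url(self) -> str:
        # Runs on first access only; the result is then stored on the instance.
        self.url_builds += 1
        return f"{self.BASE_URL}/{self.year}/{self.country.lower()}.tif"


cfg = ExampleConfig("KEN", 2020)
first = cfg.dataset_url
second = cfg.dataset_url  # served from the cache, no recomputation
```

One consequence of this pattern is that changing `country` or `year` after the first access would not refresh the cached URL, so a config of this kind is best treated as immutable.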
validate_configuration()
¶
Validate the configuration against dataset availability constraints.
Specific rules:
- Post-2020 data is only available at 1km resolution with UN adjustment.
- School age population data is only available for 2020 at 1km resolution.
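The two rules above could be checked along the following lines. This is a sketch under stated assumptions: the function name, the parameter names (`resolution`, `un_adjusted`, `school_age`), and the `"1km"` string value are illustrative and are not taken from the `WorldPopConfig` source.

```python
def check_worldpop_rules(
    year: int, resolution: str, un_adjusted: bool, school_age: bool
) -> None:
    """Raise ValueError when a combination violates the two documented rules.

    All parameter names and values are hypothetical.
    """
    # Rule 1: post-2020 data requires 1km resolution with UN adjustment.
    if year > 2020 and not (resolution == "1km" and un_adjusted):
        raise ValueError("Post-2020 data requires 1km resolution with UN adjustment")
    # Rule 2: school age data exists only for 2020 at 1km resolution.
    if school_age and not (year == 2020 and resolution == "1km"):
        raise ValueError("School age data requires year 2020 at 1km resolution")


# A valid combination passes silently.
check_worldpop_rules(year=2020, resolution="1km", un_adjusted=False, school_age=True)
```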
Source code in gigaspatial/handlers/worldpop.py
WorldPopDownloader
¶
A class to handle downloads of WorldPop datasets.
Source code in gigaspatial/handlers/worldpop.py
__init__(config, data_store=None, logger=None)
¶
Initialize the downloader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Union[WorldPopConfig, dict[str, Union[str, int]]] | Configuration for the WorldPop dataset, either as a WorldPopConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/worldpop.py
download_dataset()
¶
Download the configured dataset to the provided output path.
Source code in gigaspatial/handlers/worldpop.py
from_country_year(country, year, **kwargs)
classmethod
¶
Create a downloader instance from country and year.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
country | str | Country code or name | required |
year | int | Year of the dataset | required |
**kwargs | | Additional parameters for WorldPopConfig or the downloader | {} |
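Since `**kwargs` may carry options for either the config or the downloader, one plausible way to route them is sketched below. `SketchConfig` and `SketchDownloader` are hypothetical stand-ins, the `resolution` field is assumed, and the splitting logic is a guess at the pattern rather than the library's actual code; only the `data_store` and `logger` names come from the `__init__` signature documented above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for WorldPopConfig; the resolution field is assumed."""

    country: str
    year: int
    resolution: str = "1km"


class SketchDownloader:
    """Hypothetical stand-in for WorldPopDownloader."""

    # Keyword names taken from the documented __init__ signature.
    DOWNLOADER_KEYS = ("data_store", "logger")

    def __init__(
        self,
        config: SketchConfig,
        data_store: Optional[Any] = None,
        logger: Optional[Any] = None,
    ) -> None:
        self.config = config
        self.data_store = data_store
        self.logger = logger

    @classmethod
    def from_country_year(
        cls, country: str, year: int, **kwargs: Any
    ) -> "SketchDownloader":
        # Downloader-level options go to __init__; everything else goes to the config.
        downloader_kwargs = {
            key: kwargs.pop(key) for key in cls.DOWNLOADER_KEYS if key in kwargs
        }
        config = SketchConfig(country=country, year=year, **kwargs)
        return cls(config, **downloader_kwargs)


dl = SketchDownloader.from_country_year(
    "KEN", 2020, resolution="100m", data_store="store"
)
```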
Source code in gigaspatial/handlers/worldpop.py
read_dataset(data_store, path, compression=None, **kwargs)
¶
Read data from various file formats stored in both local and cloud-based storage.
Parameters:¶
data_store : DataStore
    Instance of DataStore for accessing data storage.
path : str, Path
    Path to the file in data storage.
**kwargs : dict
    Additional arguments passed to the specific reader function.
Returns:¶
pandas.DataFrame or geopandas.GeoDataFrame
    The data read from the file.
Raises:¶
FileNotFoundError
    If the file doesn't exist in blob storage.
ValueError
    If the file type is unsupported or if there's an error reading the file.
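The contract described above (check existence, pick a reader by file type, wrap read failures in `ValueError`) can be sketched with the standard library alone. Everything here is an assumption made for illustration: `MemoryStore` and its `file_exists`/`read_file` methods, the `READERS` table, and the choice of suffix-based dispatch. The sketch returns plain Python objects rather than DataFrames, and it is not the code in `gigaspatial/core/io/readers.py`.

```python
import csv
import io
import json
from pathlib import Path


class MemoryStore:
    """Hypothetical in-memory stand-in for a DataStore; its methods are assumed."""

    def __init__(self, files: dict) -> None:
        self.files = files

    def file_exists(self, path: str) -> bool:
        return path in self.files

    def read_file(self, path: str) -> bytes:
        return self.files[path]


def _read_csv(raw: bytes, **kwargs):
    return list(csv.DictReader(io.StringIO(raw.decode("utf-8")), **kwargs))


def _read_json(raw: bytes, **kwargs):
    return json.loads(raw.decode("utf-8"), **kwargs)


# Illustrative suffix-to-reader table; the real set of formats is larger.
READERS = {".csv": _read_csv, ".json": _read_json}


def read_dataset_sketch(data_store, path, **kwargs):
    path = str(path)
    if not data_store.file_exists(path):
        raise FileNotFoundError(f"File not found: {path}")
    suffix = Path(path).suffix.lower()
    reader = READERS.get(suffix)
    if reader is None:
        raise ValueError(f"Unsupported file type: {suffix}")
    try:
        return reader(data_store.read_file(path), **kwargs)
    except Exception as exc:
        # Surface any parser failure as ValueError, as the docstring describes.
        raise ValueError(f"Error reading {path}: {exc}") from exc


store = MemoryStore({"a.csv": b"x,y\n1,2\n", "b.xyz": b""})
rows = read_dataset_sketch(store, "a.csv")
```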
Source code in gigaspatial/core/io/readers.py
read_datasets(data_store, paths, **kwargs)
¶
Read multiple datasets from data storage at once.
Parameters:¶
data_store : DataStore
    Instance of DataStore for accessing data storage.
paths : list of str
    Paths to files in data storage.
**kwargs : dict
    Additional arguments passed to read_dataset.
Returns:¶
dict
    Dictionary mapping paths to their corresponding DataFrames/GeoDataFrames.
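The path-to-result mapping can be pictured as a dictionary comprehension over a single-file reader. In this sketch `read_many`, the injected `read_one` callable, and the fake store are all hypothetical. Whether the real `read_datasets` skips, logs, or re-raises per-file failures is not stated here, so the sketch simply lets exceptions propagate.

```python
def read_many(read_one, data_store, paths, **kwargs):
    """Map each path to whatever read_one returns for it."""
    return {path: read_one(data_store, path, **kwargs) for path in paths}


def fake_read(store, path, upper=False):
    # Toy reader, used only to make the sketch runnable.
    text = store[path]
    return text.upper() if upper else text


fake_store = {"a.txt": "alpha", "b.txt": "beta"}
result = read_many(fake_read, fake_store, ["a.txt", "b.txt"], upper=True)
```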
Source code in gigaspatial/core/io/readers.py
read_gzipped_json_or_csv(file_path, data_store)
¶
Reads a gzipped file, attempting to parse it as JSON (lines=True) or CSV.
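A try-JSON-then-CSV fallback of this kind might look like the sketch below. It works on raw bytes instead of a `(file_path, data_store)` pair, uses the standard library instead of pandas, and returns lists of dicts; the function name and these simplifications are assumptions, not the library's implementation.

```python
import csv
import gzip
import io
import json


def parse_gzipped_json_or_csv(raw: bytes):
    """Decompress gzip bytes, then try JSON Lines and fall back to CSV."""
    text = gzip.decompress(raw).decode("utf-8")
    try:
        # JSON Lines: one JSON document per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    except json.JSONDecodeError:
        # Not valid JSON Lines, so treat the content as CSV with a header row.
        return list(csv.DictReader(io.StringIO(text)))


json_rows = parse_gzipped_json_or_csv(gzip.compress(b'{"id": 1}\n{"id": 2}\n'))
csv_rows = parse_gzipped_json_or_csv(gzip.compress(b"id,name\n1,a\n"))
```

Trying JSON first is a heuristic: a CSV file whose every line happens to be valid JSON (for example, a single column of numbers) would be read as JSON.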
Source code in gigaspatial/core/io/readers.py
read_kmz(file_obj, **kwargs)
¶
Helper function to read KMZ files and return a GeoDataFrame.
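A KMZ file is a ZIP archive that contains a KML document, so the first step of any KMZ reader is to locate and extract that member. The sketch below shows only this step with the standard library; `extract_kml` is a hypothetical name, and the conversion of the KML into a GeoDataFrame that the real helper performs is left out.

```python
import io
import zipfile


def extract_kml(file_obj) -> bytes:
    """Return the bytes of the first .kml member inside a KMZ archive."""
    with zipfile.ZipFile(file_obj) as archive:
        kml_names = [n for n in archive.namelist() if n.lower().endswith(".kml")]
        if not kml_names:
            raise ValueError("No KML file found in KMZ archive")
        return archive.read(kml_names[0])


# Build a tiny in-memory KMZ to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("doc.kml", "<kml></kml>")
buffer.seek(0)
kml_bytes = extract_kml(buffer)
```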
Source code in gigaspatial/core/io/readers.py
write_dataset(data, data_store, path, **kwargs)
¶
Write a DataFrame or GeoDataFrame to various file formats in Azure Blob Storage.
Parameters:¶
data : pandas.DataFrame or geopandas.GeoDataFrame
    The data to write to blob storage.
data_store : DataStore
    Instance of DataStore for accessing data storage.
path : str
    Path where the file will be written in data storage.
**kwargs : dict
    Additional arguments passed to the specific writer function.
Raises:¶
ValueError
    If the file type is unsupported or if there's an error writing the file.
TypeError
    If input data is not a DataFrame or GeoDataFrame.
Source code in gigaspatial/core/io/writers.py
write_datasets(data_dict, data_store, **kwargs)
¶
Write multiple datasets to data storage at once.
Parameters:¶
data_dict : dict
    Dictionary mapping paths to DataFrames/GeoDataFrames.
data_store : DataStore
    Instance of DataStore for accessing data storage.
**kwargs : dict
    Additional arguments passed to write_dataset.
Raises:¶
ValueError
    If there are any errors writing the datasets.
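One way to meet the "raise `ValueError` if there are any errors" contract is to attempt every write, collect the failures, and raise once at the end. The sketch below assumes that aggregate-then-raise behaviour; the real `write_datasets` may instead stop at the first failure. `write_many`, `fake_write`, and the dict used as a store are hypothetical.

```python
def write_many(write_one, data_dict, data_store, **kwargs):
    """Try every write, then raise one ValueError listing all failures."""
    errors = {}
    for path, data in data_dict.items():
        try:
            write_one(data, data_store, path, **kwargs)
        except Exception as exc:
            errors[path] = str(exc)
    if errors:
        details = "; ".join(f"{p}: {msg}" for p, msg in errors.items())
        raise ValueError(f"Errors writing datasets: {details}")


def fake_write(data, store, path):
    # Toy writer: accepts lists only, mimicking a type check on the input data.
    if not isinstance(data, list):
        raise TypeError("data must be a list in this sketch")
    store[path] = data


store = {}
write_many(fake_write, {"ok.csv": [1, 2]}, store)
try:
    write_many(fake_write, {"good.csv": [3], "bad.csv": "oops"}, store)
    failed = False
    message = ""
except ValueError as exc:
    failed = True
    message = str(exc)
```

With this design the valid datasets are still written even when another entry in the same batch fails, which is usually preferable for bulk exports.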