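- Scenario 1: count the ATM records that fall inside a given region.
Define the region: ST_PolygonFromEnvelope;
Find which ATMs lie inside that region: ST_Contains();
- Scenario 2: GeoSpark + Spark GraphX + timestamps can trace where each transaction flowed at each point in time and where it finally ended up.
- About the ST_Contains function:
ST_Contains takes two geometry objects and tests whether the first completely contains the second. In SQL databases it returns 1 (Oracle and SQLite) or t (PostgreSQL) if it does, and 0 (Oracle and SQLite) or f (PostgreSQL) otherwise. In GeoSpark SQL the function is a Boolean predicate.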
The steps are as follows:
- The test data is a shapefile
- Test code
import findspark
findspark.init()
from pyspark.sql import SparkSession
from geospark.core.formatMapper.shapefileParser import ShapefileReader
from geospark.register import GeoSparkRegistrator
from geospark.utils import KryoSerializer, GeoSparkKryoRegistrator
from geospark.utils.adapter import Adapter
# Use Kryo serialization with GeoSpark's registrator for spatial objects
spark = SparkSession.builder\
.config("spark.serializer", KryoSerializer.getName)\
.config("spark.kryo.registrator", GeoSparkKryoRegistrator.getName)\
.getOrCreate()
# Register the GeoSpark SQL functions (ST_*) on this session
GeoSparkRegistrator.registerAll(spark)
# Directory that holds the shapefile
input_location = r"D:\pycharm\pythonProject\GeoSpark\functions\taxi\txdata"
# Read the shapefile into a spatial RDD and expose it as a SQL view
shapeRDD = ShapefileReader.readToGeometryRDD(spark.sparkContext, input_location)
df = Adapter.toDf(shapeRDD, spark)
df.createOrReplaceTempView("p_view")
# Reproject every point from Web Mercator (epsg:3857) into a longitude/latitude column.
# The target CRS is kept as 'epsg:4236', as in the run that produced the results below; WGS 84 is 'epsg:4326'.
rsltDf = spark.sql("""
select *, ST_Transform(ST_PointFromText(p_view.geometry, 'WKT'), 'epsg:3857', 'epsg:4236') as longlat from p_view
""")
rsltDf.show(truncate=False)
print("rslt:", rsltDf.count())
rsltDf.createOrReplaceTempView("rslt_view")
# Keep only the points that lie inside the rectangular region
r2 = spark.sql("""
select * from rslt_view where ST_Contains(ST_PolygonFromEnvelope(-70.0,30.0,-80.0,40.0),rslt_view.longlat)=1
""")
r2.show(truncate=False)
print("r2:", r2.count())
- 2.1 The key code is:
r2 = spark.sql("""
select * from rslt_view where ST_Contains(ST_PolygonFromEnvelope(-70.0,30.0,-80.0,40.0),rslt_view.longlat)=1
""")
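To make the filter concrete, here is a minimal pure-Python sketch of what the query checks for an axis-aligned envelope and a point. It is not GeoSpark's implementation. The names `Envelope`, `envelope_from_corners`, and `envelope_contains_point` are hypothetical, and the sketch assumes the envelope bounds are normalized (the query passes -70.0 before -80.0 for x) and that a point on the boundary does not count as contained.

```python
from typing import NamedTuple


class Envelope(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper, not a GeoSpark type)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def envelope_from_corners(x1: float, y1: float, x2: float, y2: float) -> Envelope:
    # Assumption: the bounds are normalized, so the corner order does not matter.
    return Envelope(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def envelope_contains_point(env: Envelope, x: float, y: float) -> bool:
    # Strict inequalities: a point on the boundary is not treated as contained.
    return env.min_x < x < env.max_x and env.min_y < y < env.max_y


# Same numbers as in the query, in the same argument order.
region = envelope_from_corners(-70.0, 30.0, -80.0, 40.0)

# One longlat value from the filtered output and one from the unfiltered output.
inside = envelope_contains_point(region, -76.23399099122358, 36.91841857015605)
outside = envelope_contains_point(region, -83.55026400749416, 41.63421823355603)
print(inside, outside)  # True False
```

The first point comes from the filtered result set and the second appears only in the unfiltered one, which matches what the sketch reports.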
- Test results:
+---------------------------------------------+----+---------+---------+------+-------------------+----------------------------------------------+
|geometry |OID_|ORIG_ID |DEST_ID |AMOUNT|DATE_TIME |longlat |
+---------------------------------------------+----+---------+---------+------+-------------------+----------------------------------------------+
|POINT (-9301700.8034 5105596.222599998) |0 |258615475|437160417|8820 |2016-11-11 07:13:05|POINT (-83.55026400749416 41.63421823355603) |
|POINT (-9358740.9105 4037627.3462999985) |0 |704942357|510424955|4410 |2016-10-30 14:37:54|POINT (-84.06352152841099 34.070404819499196) |
|POINT (-9264386.5101 3682406.9626) |0 |001203959|404353450|8719 |2016-01-25 07:19:24|POINT (-83.2160684443583 31.386102079091344) |
|POINT (-9649006.4827 3951012.173100002) |0 |529097289|907207476|1372 |2016-10-23 17:35:57|POINT (-86.67132004895393 33.42352249150436) |
|POINT (-13180016.2029 4048140.6855999976) |0 |367009578|433332387|6182 |2016-06-12 16:46:49|POINT (-118.39485492578717 34.150156057427154)|
|POINT (-9630761.2182 4323872.919) |0 |400371343|874866791|6115 |2016-02-01 08:42:29|POINT (-86.50716135029704 36.17376931800233) |
|POINT (-8082997.2820999995 5188356.3825) |0 |969465806|897087904|9986 |2016-03-25 07:50:55|POINT (-72.60145509838317 42.18668862834412) |
|POINT (-8240558.8894 4888681.432899997) |0 |700460031|153246913|4916 |2016-02-20 07:12:31|POINT (-74.01723811036628 40.16055497253425) |
|POINT (-10854451.8527 4223728.012699999) |0 |651043044|676686033|2436 |2016-03-10 10:44:53|POINT (-97.50103376340422 35.44474772295976) |
|POINT (-9287006.6306 4606173.8500000015) |0 |464894120|465924488|8329 |2016-08-08 17:08:43|POINT (-83.41865851599233 38.19396835831678) |
|POINT (-8230317.496200001 4974816.362899996) |0 |369292057|829097328|7027 |2016-11-18 17:25:39|POINT (-73.92515285656063 40.74936267965346) |
|POINT (-13189033.0816 4072121.144299999) |0 |171982572|109247577|2613 |2016-09-03 09:30:19|POINT (-118.47586016147616 34.32827492016201) |
|POINT (-11851585.059500001 3734204.825000003)|0 |415102893|404170119|6796 |2016-12-30 14:42:47|POINT (-106.45979278427104 31.78365143151068) |
|POINT (-9248913.1009 5236898.5506) |0 |430608075|534452230|3408 |2016-02-09 17:50:48|POINT (-83.07590121227808 42.50981473259561) |
|POINT (-8422299.09 5071907.6696000025) |0 |543350607|469325922|8182 |2016-12-17 14:07:31|POINT (-75.6497927871166 41.407014403970145) |
|POINT (-8233445.573899999 4966253.920500003) |0 |400713444|423284456|1837 |2016-02-28 09:11:52|POINT (-73.95326278439005 40.69106344313047) |
|POINT (-9097474.0656 3551353.9376000017) |0 |405111054|400396185|8824 |2016-01-20 07:57:50|POINT (-81.71661953409682 30.375557412066826) |
|POINT (-13620574.219700001 5713090.735399999)|0 |543690234|691363014|3611 |2016-12-18 07:30:11|POINT (-122.35256948181386 45.585971687795634)|
|POINT (-10635297.1712 3459459.2875000015) |0 |162287502|989286406|1624 |2016-05-17 09:23:35|POINT (-95.53249791616715 29.661413022111155) |
|POINT (-13164542.7937 4030950.637599997) |0 |705176547|617526611|5482 |2016-09-23 07:48:18|POINT (-118.25583916701737 34.02224032714601) |
+---------------------------------------------+----+---------+---------+------+-------------------+----------------------------------------------+
only showing top 20 rows
rslt: 10000000 # total number of records: 10000000
+--------------------------------------------+----+---------+---------+------+-------------------+---------------------------------------------+
|geometry |OID_|ORIG_ID |DEST_ID |AMOUNT|DATE_TIME |longlat |
+--------------------------------------------+----+---------+---------+------+-------------------+---------------------------------------------+
|POINT (-8487265.1449 4427140.189000003) |0 |613155274|699462503|2150 |2016-03-24 16:25:46|POINT (-76.23399099122358 36.91841857015605) |
|POINT (-8260306.967 4860232.953000002) |0 |383733987|412284108|3775 |2016-09-08 08:46:36|POINT (-74.19467651224136 39.964963526525544)|
|POINT (-8575296.5982 4707672.228) |0 |898291273|617746094|8791 |2016-09-19 17:59:04|POINT (-77.02462314927627 38.906644699794334)|
|POINT (-8375121.889799999 4859869.875299998)|0 |401808157|423696288|6411 |2016-04-09 17:17:54|POINT (-75.2261530204051 39.96254251910555) |
|POINT (-8531692.7536 4763023.847000003) |0 |419723724|427577953|7585 |2016-02-27 17:19:05|POINT (-76.63284540943305 39.29252832749463) |
|POINT (-8365292.378799999 4861685.396399997)|0 |811922731|898313044|1776 |2016-04-05 17:20:05|POINT (-75.13784477335199 39.97503612885506) |
|POINT (-8473628.5072 4414018.9070999995) |0 |123868556|157287061|9041 |2016-08-23 17:24:10|POINT (-76.11149219213411 36.824106492183766)|
|POINT (-8809946.9528 4496957.214299999) |0 |001185305|404565153|5357 |2016-03-28 07:42:15|POINT (-79.13286429908266 37.41844119060501) |
|POINT (-8509128.2929 4421989.928999998) |0 |658296942|874773799|8854 |2016-03-21 10:15:39|POINT (-76.4304097738809 36.88143110953037) |
|POINT (-8346846.7392 4795480.815099999) |0 |001120328|415175872|7285 |2016-09-24 12:49:16|POINT (-74.97219080270794 39.51771051706394) |
|POINT (-8630767.1004 4521611.184500001) |0 |590320537|959535261|6446 |2016-03-17 17:00:34|POINT (-77.52311494596034 37.59403466623528) |
|POINT (-8360616.960200001 4865186.658500001)|0 |607672268|691944540|5161 |2016-10-31 10:00:51|POINT (-75.0958385113615 39.999133594882416) |
|POINT (-8493944.3143 4401288.7848000005) |0 |868680315|433263778|8571 |2016-08-07 17:40:45|POINT (-76.29401578545901 36.73251603895104) |
|POINT (-8365960.295700001 4858911.414099999)|0 |707126404|874878432|3399 |2016-05-21 10:51:26|POINT (-75.14384768674081 39.9559360502999) |
|POINT (-8811483.1618 4495990.198299997) |0 |434241136|659874820|7149 |2016-02-08 17:58:46|POINT (-79.14666619508805 37.41154187709866) |
|POINT (-8900416.303 4811524.744000003) |0 |425587112|674192653|4558 |2016-03-03 07:40:30|POINT (-79.9453785344068 39.629183057539535) |
|POINT (-8897889.3505 3867807.3281999975) |0 |988401493|427555710|8827 |2016-11-18 17:59:36|POINT (-79.923379062258 32.79698387586238) |
|POINT (-8575218.6745 4710776.7553) |0 |596831800|232957951|6249 |2016-01-27 17:42:01|POINT (-77.0239204812753 38.928345370662974) |
|POINT (-8701310.2617 4748392.690700002) |0 |424685607|435420343|9593 |2016-03-24 17:48:59|POINT (-78.15668003548083 39.19083730448294) |
|POINT (-8681027.8505 4220258.071999997) |0 |968573402|512933045|6564 |2016-03-07 17:52:22|POINT (-77.97488118958793 35.41817997952977) |
+--------------------------------------------+----+---------+---------+------+-------------------+---------------------------------------------+
only showing top 20 rows
r2: 598243 # records inside the region: 598243
Process finished with exit code 0
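The longlat column can be sanity-checked against the geometry column with the spherical inverse Web Mercator formula. The sketch below is only an approximation under that assumption, not what ST_Transform does internally. The function name `web_mercator_to_lonlat` is hypothetical, and the result is expected to differ slightly from the table, presumably because ST_Transform also accounts for the datum of the target CRS, which this sketch ignores.

```python
import math

# Sphere radius used by Web Mercator (EPSG:3857), in metres.
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lonlat(x: float, y: float) -> tuple:
    """Convert Web Mercator metres to (longitude, latitude) in degrees on a sphere."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


# First row of the unfiltered output: POINT (-9301700.8034 5105596.222599998)
lon, lat = web_mercator_to_lonlat(-9301700.8034, 5105596.222599998)
print(lon, lat)
```

The printed values should land within a few hundredths of a degree of the first row's longlat value, POINT (-83.55026400749416 41.63421823355603).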