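- Converting projected spatial coordinates to a longitude/latitude coordinate system
- Test data file type: shapefile
- The intermediate function used is ST_PointFromText, which converts WKT text into geometry-type data.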
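To make concrete what parsing a WKT point involves, here is a minimal plain-Python sketch. It is not GeoSpark's implementation: `parse_wkt_point` is a hypothetical helper introduced only for illustration, and it assumes the simple two-dimensional `POINT (x y)` form that appears in this data.

```python
import re

# Hypothetical helper for illustration only; not part of GeoSpark.
# Matches the 2D form "POINT (x y)", ignoring case and extra whitespace.
_POINT_RE = re.compile(r"^\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*$", re.IGNORECASE)


def parse_wkt_point(wkt):
    """Parse a 2D WKT point such as 'POINT (1.5 2.5)' into an (x, y) tuple of floats."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


# Example with the first geometry string from the source data
x, y = parse_wkt_point("POINT (-9301700.8034 5105596.222599998)")
print(x, y)
```

A real WKT parser also has to handle other geometry types, Z/M coordinates, and empty geometries, which is why a library function is used in the test code.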
- Test code:
import findspark
findspark.init()  # locate the local Spark installation before importing pyspark

from pyspark.sql import SparkSession
from geospark.core.formatMapper.shapefileParser import ShapefileReader
from geospark.register import GeoSparkRegistrator
from geospark.utils import KryoSerializer, GeoSparkKryoRegistrator
from geospark.utils.adapter import Adapter

# Create a Spark session that uses Kryo serialization with GeoSpark's registrator
spark = (
    SparkSession.builder
    .config("spark.serializer", KryoSerializer.getName)
    .config("spark.kryo.registrator", GeoSparkKryoRegistrator.getName)
    .getOrCreate()
)
GeoSparkRegistrator.registerAll(spark)  # register the ST_* SQL functions

# Location of the shapefile data
input_location = r"D:\pycharm\pythonProject\GeoSpark\functions\taxi\txdata"
shapeRDD = ShapefileReader.readToGeometryRDD(spark.sparkContext, input_location)

# Convert the spatial RDD to a DataFrame; the schema printed below shows geometry as a string
df = Adapter.toDf(shapeRDD, spark)
df.show(truncate=False)
df.printSchema()
# shapeRDD.saveAsWKT(r"D:\pycharm\pythonProject\GeoSpark\functions\taxi\rsltWKT")
df.createOrReplaceTempView("p_view")
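# Parse the WKT string into a point, then reproject it from EPSG:3857.
# The target here is EPSG:4326, the code named in the note at the end; the run printed
# below passed "epsg:4236" instead, which is why that code appears in its column name.
rsltDf = spark.sql("""
    SELECT *,
           ST_Transform(ST_PointFromText(p_view.geometry, 'WKT'), 'epsg:3857', 'epsg:4326')
    FROM p_view
""")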
rsltDf.show(truncate=False)
rsltDf.printSchema()
- Test results:
Source data output:
+---------------------------------------------+----+---------+---------+------+-------------------+
|geometry |OID_|ORIG_ID |DEST_ID |AMOUNT|DATE_TIME |
+---------------------------------------------+----+---------+---------+------+-------------------+
|POINT (-9301700.8034 5105596.222599998) |0 |258615475|437160417|8820 |2016-11-11 07:13:05|
|POINT (-9358740.9105 4037627.3462999985) |0 |704942357|510424955|4410 |2016-10-30 14:37:54|
|POINT (-9264386.5101 3682406.9626) |0 |001203959|404353450|8719 |2016-01-25 07:19:24|
|POINT (-9649006.4827 3951012.173100002) |0 |529097289|907207476|1372 |2016-10-23 17:35:57|
|POINT (-13180016.2029 4048140.6855999976) |0 |367009578|433332387|6182 |2016-06-12 16:46:49|
|POINT (-9630761.2182 4323872.919) |0 |400371343|874866791|6115 |2016-02-01 08:42:29|
|POINT (-8082997.2820999995 5188356.3825) |0 |969465806|897087904|9986 |2016-03-25 07:50:55|
|POINT (-8240558.8894 4888681.432899997) |0 |700460031|153246913|4916 |2016-02-20 07:12:31|
|POINT (-10854451.8527 4223728.012699999) |0 |651043044|676686033|2436 |2016-03-10 10:44:53|
|POINT (-9287006.6306 4606173.8500000015) |0 |464894120|465924488|8329 |2016-08-08 17:08:43|
|POINT (-8230317.496200001 4974816.362899996) |0 |369292057|829097328|7027 |2016-11-18 17:25:39|
|POINT (-13189033.0816 4072121.144299999) |0 |171982572|109247577|2613 |2016-09-03 09:30:19|
|POINT (-11851585.059500001 3734204.825000003)|0 |415102893|404170119|6796 |2016-12-30 14:42:47|
|POINT (-9248913.1009 5236898.5506) |0 |430608075|534452230|3408 |2016-02-09 17:50:48|
|POINT (-8422299.09 5071907.6696000025) |0 |543350607|469325922|8182 |2016-12-17 14:07:31|
|POINT (-8233445.573899999 4966253.920500003) |0 |400713444|423284456|1837 |2016-02-28 09:11:52|
|POINT (-9097474.0656 3551353.9376000017) |0 |405111054|400396185|8824 |2016-01-20 07:57:50|
|POINT (-13620574.219700001 5713090.735399999)|0 |543690234|691363014|3611 |2016-12-18 07:30:11|
|POINT (-10635297.1712 3459459.2875000015) |0 |162287502|989286406|1624 |2016-05-17 09:23:35|
|POINT (-13164542.7937 4030950.637599997) |0 |705176547|617526611|5482 |2016-09-23 07:48:18|
+---------------------------------------------+----+---------+---------+------+-------------------+
only showing top 20 rows
root
|-- geometry: string (nullable = true)
|-- OID_: string (nullable = true)
|-- ORIG_ID: string (nullable = true)
|-- DEST_ID: string (nullable = true)
|-- AMOUNT: string (nullable = true)
|-- DATE_TIME: string (nullable = true)
Transformed data output:
+---------------------------------------------+----+---------+---------+------+-------------------+-------------------------------------------------------------------+
|geometry |OID_|ORIG_ID |DEST_ID |AMOUNT|DATE_TIME |st_transform(st_pointfromtext(geometry, WKT), epsg:3857, epsg:4236)|
+---------------------------------------------+----+---------+---------+------+-------------------+-------------------------------------------------------------------+
|POINT (-9301700.8034 5105596.222599998) |0 |258615475|437160417|8820 |2016-11-11 07:13:05|POINT (-83.55026400749416 41.63421823355603) |
|POINT (-9358740.9105 4037627.3462999985) |0 |704942357|510424955|4410 |2016-10-30 14:37:54|POINT (-84.06352152841099 34.070404819499196) |
|POINT (-9264386.5101 3682406.9626) |0 |001203959|404353450|8719 |2016-01-25 07:19:24|POINT (-83.2160684443583 31.386102079091344) |
|POINT (-9649006.4827 3951012.173100002) |0 |529097289|907207476|1372 |2016-10-23 17:35:57|POINT (-86.67132004895393 33.42352249150436) |
|POINT (-13180016.2029 4048140.6855999976) |0 |367009578|433332387|6182 |2016-06-12 16:46:49|POINT (-118.39485492578717 34.150156057427154) |
|POINT (-9630761.2182 4323872.919) |0 |400371343|874866791|6115 |2016-02-01 08:42:29|POINT (-86.50716135029704 36.17376931800233) |
|POINT (-8082997.2820999995 5188356.3825) |0 |969465806|897087904|9986 |2016-03-25 07:50:55|POINT (-72.60145509838317 42.18668862834412) |
|POINT (-8240558.8894 4888681.432899997) |0 |700460031|153246913|4916 |2016-02-20 07:12:31|POINT (-74.01723811036628 40.16055497253425) |
|POINT (-10854451.8527 4223728.012699999) |0 |651043044|676686033|2436 |2016-03-10 10:44:53|POINT (-97.50103376340422 35.44474772295976) |
|POINT (-9287006.6306 4606173.8500000015) |0 |464894120|465924488|8329 |2016-08-08 17:08:43|POINT (-83.41865851599233 38.19396835831678) |
|POINT (-8230317.496200001 4974816.362899996) |0 |369292057|829097328|7027 |2016-11-18 17:25:39|POINT (-73.92515285656063 40.74936267965346) |
|POINT (-13189033.0816 4072121.144299999) |0 |171982572|109247577|2613 |2016-09-03 09:30:19|POINT (-118.47586016147616 34.32827492016201) |
|POINT (-11851585.059500001 3734204.825000003)|0 |415102893|404170119|6796 |2016-12-30 14:42:47|POINT (-106.45979278427104 31.78365143151068) |
|POINT (-9248913.1009 5236898.5506) |0 |430608075|534452230|3408 |2016-02-09 17:50:48|POINT (-83.07590121227808 42.50981473259561) |
|POINT (-8422299.09 5071907.6696000025) |0 |543350607|469325922|8182 |2016-12-17 14:07:31|POINT (-75.6497927871166 41.407014403970145) |
|POINT (-8233445.573899999 4966253.920500003) |0 |400713444|423284456|1837 |2016-02-28 09:11:52|POINT (-73.95326278439005 40.69106344313047) |
|POINT (-9097474.0656 3551353.9376000017) |0 |405111054|400396185|8824 |2016-01-20 07:57:50|POINT (-81.71661953409682 30.375557412066826) |
|POINT (-13620574.219700001 5713090.735399999)|0 |543690234|691363014|3611 |2016-12-18 07:30:11|POINT (-122.35256948181386 45.585971687795634) |
|POINT (-10635297.1712 3459459.2875000015) |0 |162287502|989286406|1624 |2016-05-17 09:23:35|POINT (-95.53249791616715 29.661413022111155) |
|POINT (-13164542.7937 4030950.637599997) |0 |705176547|617526611|5482 |2016-09-23 07:48:18|POINT (-118.25583916701737 34.02224032714601) |
+---------------------------------------------+----+---------+---------+------+-------------------+-------------------------------------------------------------------+
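- When using ST_PointFromText, pay attention to the format argument you pass in. Typical values include TSV, CSV, WKT, WKB, and so on.
epsg:3857 is the projected (Web Mercator) coordinate system, and epsg:4326 is the longitude/latitude (WGS 84) coordinate system. Note that the output column name above shows epsg:4236 as the target of the run, which is a different code from epsg:4326, so double-check the digits when writing the target CRS.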
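As a rough sanity check on the relationship between the two coordinate systems, the spherical Web Mercator formulas can be inverted by hand. The sketch below is plain Python and independent of GeoSpark; the function names are hypothetical, and it assumes the usual sphere radius of 6378137 m for EPSG:3857. For the first row of the table it gives a longitude of about -83.56 and a latitude of about 41.63, close to but not identical with the printed values, which may reflect the different target code used in that run.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius assumed for EPSG:3857


def web_mercator_to_lonlat(x, y):
    """Invert spherical Web Mercator; returns (lon, lat) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / EARTH_RADIUS_M)))
    return lon, lat


def lonlat_to_web_mercator(lon, lat):
    """Forward spherical Web Mercator; returns (x, y) in metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.asinh(math.tan(math.radians(lat)))
    return x, y


# First row of the source data
lon, lat = web_mercator_to_lonlat(-9301700.8034, 5105596.222599998)
print(lon, lat)
```

This is only a check on magnitudes and axis order. For real data, a library transform such as ST_Transform remains the better choice because it also handles datum differences between coordinate systems.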