Parquet

Published by onesixx on

https://parquet.apache.org/ : Apache Parquet

https://arrow.apache.org/docs/python/parquet.html#

A standardized, open-source, column-oriented storage format.
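To make "column-oriented" concrete, here is a toy sketch in plain Python. It only illustrates the idea of grouping values by column instead of by row; it is not how Parquet actually encodes data on disk, and the sample records and the `to_columns` helper are made up for illustration.

```python
# Toy illustration only: this does NOT reproduce Parquet's on-disk encoding.
# The records and the to_columns helper are hypothetical.
rows = [
    {"id": 1, "city": "Seoul", "temp": 21.5},
    {"id": 2, "city": "Busan", "temp": 23.0},
    {"id": 3, "city": "Seoul", "temp": 19.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column only needs that column's values,
# which is the access pattern a columnar format is designed for.
mean_temp = sum(columns["temp"]) / len(columns["temp"])
```

Values of the same column also share a type and tend to repeat, which is roughly why columnar files usually compress well.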

pandas.DataFrame <-> Parquet

import pandas as pd

# pandas needs a Parquet engine; install one of these first:
# conda install -c conda-forge fastparquet
# pip install fastparquet
# pip install pyarrow
import pyarrow as pa
import pyarrow.parquet as pq

df = pd.read_csv('dataset/st.csv')
# isinstance(df, pd.DataFrame)  # True

### pd.DataFrame to Parquet
df.to_parquet('dataset/st.parquet')
# Same thing with pyarrow directly:
# pq.write_table(pa.Table.from_pandas(df), 'dataset/st1.parquet')

### Parquet to pd.DataFrame
train_df = pd.read_parquet('dataset/st.parquet')
# Same thing with pyarrow directly:
# train_df = pq.read_table('./dataset/st1.parquet').to_pandas()