CSV processing
Read CSV records
Reads standard CSV records into csv::StringRecord, a weakly typed data representation that expects valid UTF-8 rows. Alternatively, csv::ByteRecord makes no assumptions about UTF-8.

    extern crate csv;
    use csv::Error;

    fn main() -> Result<(), Error> {
        // The first line is treated as the header row.
        let csv = "year,make,model,description
    1948,Porsche,356,Luxury sports car
    1967,Ford,Mustang fastback 1967,American car";

        let mut reader = csv::Reader::from_reader(csv.as_bytes());
        for record in reader.records() {
            // Each item is a Result; `?` propagates parse errors.
            let record = record?;
            println!(
                "In {}, {} built the {} model. It is a {}.",
                &record[0],
                &record[1],
                &record[2],
                &record[3]
            );
        }

        Ok(())
    }

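The difference between the two record types comes down to whether every field must decode as UTF-8. The following std-only sketch illustrates that idea without the csv crate. `raw_fields` and `decode_fields` are hypothetical helpers introduced here, and the naive comma split ignores quoting, so this is not how the crate itself parses records.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: split one raw line on commas and keep the bytes
/// as they are. This never fails, whatever the encoding.
fn raw_fields(line: &[u8]) -> Vec<&[u8]> {
    line.split(|&b| b == b',').collect()
}

/// Hypothetical helper: the same naive split, but every field must be
/// valid UTF-8, otherwise the whole line is rejected.
fn decode_fields(line: &[u8]) -> Result<Vec<&str>, Utf8Error> {
    line.split(|&b| b == b',')
        .map(std::str::from_utf8)
        .collect()
}
```

For a line such as `b"1948,Porsche,356"` both helpers succeed. If a field contains a byte like `0xFF`, `decode_fields` returns an error while `raw_fields` still yields the bytes, which is the kind of input csv::ByteRecord is intended for.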
Serde deserializes data into strongly typed structures. See the csv::Reader::deserialize method.

    extern crate csv;
    #[macro_use]
    extern crate error_chain;
    #[macro_use]
    extern crate serde_derive;

    error_chain! {
        foreign_links {
            Reader(csv::Error);
        }
    }

    // Field names must match the CSV header names.
    #[derive(Deserialize)]
    struct Record {
        year: u16,
        make: String,
        model: String,
        description: String,
    }

    fn run() -> Result<()> {
        let csv = "year,make,model,description
    1948,Porsche,356,Luxury sports car
    1967,Ford,Mustang fastback 1967,American car";

        let mut reader = csv::Reader::from_reader(csv.as_bytes());

        for record in reader.deserialize() {
            let record: Record = record?;
            println!(
                "In {}, {} built the {} model. It is a {}.",
                record.year,
                record.make,
                record.model,
                record.description
            );
        }

        Ok(())
    }

    quick_main!(run);

Read CSV records with a different delimiter
Reads CSV records that use a tab as the delimiter.

    extern crate csv;
    use csv::Error;
    #[macro_use]
    extern crate serde_derive;

    #[derive(Debug, Deserialize)]
    struct Record {
        name: String,
        place: String,
        #[serde(deserialize_with = "csv::invalid_option")]
        id: Option<u64>,
    }

    use csv::ReaderBuilder;

    fn main() -> Result<(), Error> {
        let data = "name\tplace\tid
    Mark\tMelbourne\t46
    Ashley\tZurich\t92";

        // Override the default comma delimiter with a tab.
        let mut reader = ReaderBuilder::new().delimiter(b'\t').from_reader(data.as_bytes());
        for result in reader.deserialize::<Record>() {
            println!("{:?}", result?);
        }

        Ok(())
    }

Filter CSV records matching a predicate
Returns only the rows from data that have a field matching query.

    #[macro_use]
    extern crate error_chain;
    extern crate csv;

    use std::io;

    error_chain! {
        foreign_links {
            Io(std::io::Error);
            CsvError(csv::Error);
        }
    }

    fn run() -> Result<()> {
        let query = "CA";
        let data = "\
    City,State,Population,Latitude,Longitude
    Kenai,AK,7610,60.5544444,-151.2583333
    Oakman,AL,,33.7133333,-87.3886111
    Sandfort,AL,,32.3380556,-85.2233333
    West Hollywood,CA,37031,34.0900000,-118.3608333";

        let mut rdr = csv::ReaderBuilder::new().from_reader(data.as_bytes());
        let mut wtr = csv::Writer::from_writer(io::stdout());

        // Always pass the header row through.
        wtr.write_record(rdr.headers()?)?;

        for result in rdr.records() {
            let record = result?;
            // Keep the row if any of its fields equals the query.
            if record.iter().any(|field| field == query) {
                wtr.write_record(&record)?;
            }
        }

        wtr.flush()?;
        Ok(())
    }

    quick_main!(run);

Disclaimer: this example has been adapted from the csv crate tutorial.
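The predicate in the filtering example matches a query against any field of a row. As a variation, the following std-only sketch restricts the match to a single column, so a city that happened to be named "CA" would not match a state query. `filter_by_column` is a hypothetical helper, and its naive comma split does not handle quoted fields, so treat it as an illustration of the predicate rather than a replacement for the csv crate.

```rust
/// Hypothetical helper: keep the header line plus every row whose
/// `column`-th field (zero-based) equals `query`.
/// Naive comma split; quoted fields are not handled.
fn filter_by_column<'a>(data: &'a str, column: usize, query: &str) -> Vec<&'a str> {
    let mut lines = data.lines();
    // The first line, if present, is kept unconditionally as the header.
    let mut kept: Vec<&str> = lines.next().into_iter().collect();
    // The predicate: compare only the chosen column with the query.
    kept.extend(lines.filter(|row| row.split(',').nth(column) == Some(query)));
    kept
}
```

Called with the sample data, column `1`, and the query `"CA"`, the helper keeps the header and the West Hollywood row. With column `0` it keeps only the header, because no city is named "CA".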
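Handle invalid CSV data with Serde
CSV files often contain invalid data. For these cases, the csv crate provides a custom deserializer, csv::invalid_option, which automatically converts invalid data to None values.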

    extern crate csv;
    use csv::Error;
    #[macro_use]
    extern crate serde_derive;

    #[derive(Debug, Deserialize)]
    struct Record {
        name: String,
        place: String,
        // An id that cannot be parsed as u64 becomes None.
        #[serde(deserialize_with = "csv::invalid_option")]
        id: Option<u64>,
    }

    fn main() -> Result<(), Error> {
        let data = "name,place,id
    mark,sydney,46.5
    ashley,zurich,92
    akshat,delhi,37
    alisha,colombo,xyz";

        let mut rdr = csv::Reader::from_reader(data.as_bytes());
        for result in rdr.deserialize() {
            let record: Record = result?;
            println!("{:?}", record);
        }

        Ok(())
    }

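The effect on a single field can be approximated with the standard library alone. The sketch below is a hypothetical stand-in (`parse_id` is not part of the csv crate) that mirrors the behaviour described above: a value that fails to parse yields `None` instead of an error.

```rust
/// Hypothetical stand-in for the "invalid becomes None" behaviour:
/// a field that does not parse as `u64` is mapped to `None`.
fn parse_id(field: &str) -> Option<u64> {
    // `parse` returns a Result; `ok()` discards the error and keeps
    // only the successfully parsed value.
    field.parse::<u64>().ok()
}
```

Applied to the id column of the sample data, `parse_id` gives `None` for `46.5` and `xyz`, and `Some(92)` and `Some(37)` for the two valid rows.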
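Serialize records to CSV
This example shows how to serialize a Rust tuple. csv::Writer supports automatic serialization from Rust types into CSV records. write_record writes only simple records that contain string data. For data with more complex values, such as numbers, floats, and options, use serialize. Because the CSV writer uses an internal buffer, always flush explicitly when done.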

    #[macro_use]
    extern crate error_chain;
    extern crate csv;

    use std::io;

    error_chain! {
        foreign_links {
            CSVError(csv::Error);
            IOError(std::io::Error);
        }
    }

    fn run() -> Result<()> {
        let mut wtr = csv::Writer::from_writer(io::stdout());

        // write_record is enough for the all-string header row.
        wtr.write_record(&["Name", "Place", "ID"])?;

        // serialize handles the mixed string/number tuples.
        wtr.serialize(("Mark", "Sydney", 87))?;
        wtr.serialize(("Ashley", "Dublin", 32))?;
        wtr.serialize(("Akshat", "Delhi", 11))?;

        // Push any buffered output to stdout.
        wtr.flush()?;
        Ok(())
    }

    quick_main!(run);

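Why the explicit flush matters can be seen with any buffered writer. The sketch below uses `std::io::BufWriter` as an analogy; it is not the csv writer, and `bytes_before_and_after_flush` is a hypothetical helper that reports how many bytes reached the underlying `Vec<u8>` before and after `flush`.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper: write one small record through a buffered writer
/// and report how many bytes the underlying sink holds before and after
/// an explicit `flush`.
fn bytes_before_and_after_flush() -> io::Result<(usize, usize)> {
    let mut wtr = BufWriter::new(Vec::new());
    wtr.write_all(b"Mark,Sydney,87\n")?;
    // A small write sits in the internal buffer at this point.
    let before = wtr.get_ref().len();
    wtr.flush()?;
    // After flushing, the bytes have been handed to the sink.
    let after = wtr.get_ref().len();
    Ok((before, after))
}
```

Under these assumptions the record only becomes visible in the sink after `flush`, which is the same reason the example above ends with `wtr.flush()?`.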
Serialize records to CSV using Serde
The following example shows how to serialize custom structs as CSV records using the serde crate.

    #[macro_use]
    extern crate error_chain;
    extern crate csv;
    #[macro_use]
    extern crate serde_derive;

    use std::io;

    error_chain! {
        foreign_links {
            IOError(std::io::Error);
            CSVError(csv::Error);
        }
    }

    // The struct's field names are used for the header row.
    #[derive(Serialize)]
    struct Record<'a> {
        name: &'a str,
        place: &'a str,
        id: u64,
    }

    fn run() -> Result<()> {
        let mut wtr = csv::Writer::from_writer(io::stdout());

        let rec1 = Record { name: "Mark", place: "Melbourne", id: 56 };
        let rec2 = Record { name: "Ashley", place: "Sydney", id: 64 };
        let rec3 = Record { name: "Akshat", place: "Delhi", id: 98 };

        wtr.serialize(rec1)?;
        wtr.serialize(rec2)?;
        wtr.serialize(rec3)?;

        wtr.flush()?;

        Ok(())
    }

    quick_main!(run);

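Transform a CSV column
Transforms a CSV file containing a color name and a hex color into one with a color name and an RGB color. Uses the csv crate to read and write the CSV file, and serde to deserialize and serialize the rows to and from bytes.
See csv::Reader::deserialize, serde::Deserialize, and std::str::FromStr.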

    extern crate csv;
    #[macro_use]
    extern crate error_chain;
    #[macro_use]
    extern crate serde_derive;
    extern crate serde;

    use csv::{Reader, Writer};
    use serde::{de, Deserialize, Deserializer};
    use std::str::FromStr;

    error_chain! {
        foreign_links {
            CsvError(csv::Error);
            ParseInt(std::num::ParseIntError);
            CsvInnerError(csv::IntoInnerError<Writer<Vec<u8>>>);
            IO(std::fmt::Error);
            UTF8(std::string::FromUtf8Error);
        }
    }

    #[derive(Debug)]
    struct HexColor {
        red: u8,
        green: u8,
        blue: u8,
    }

    #[derive(Debug, Deserialize)]
    struct Row {
        color_name: String,
        color: HexColor,
    }

    // Parse "#rrggbb" into its three channels.
    impl FromStr for HexColor {
        type Err = Error;

        fn from_str(hex_color: &str) -> std::result::Result<Self, Self::Err> {
            let trimmed = hex_color.trim_matches('#');
            if trimmed.len() != 6 {
                Err("Invalid length of hex string".into())
            } else {
                Ok(HexColor {
                    red: u8::from_str_radix(&trimmed[..2], 16)?,
                    green: u8::from_str_radix(&trimmed[2..4], 16)?,
                    blue: u8::from_str_radix(&trimmed[4..6], 16)?,
                })
            }
        }
    }

    // Deserialize the field as a string, then reuse the FromStr impl.
    impl<'de> Deserialize<'de> for HexColor {
        fn deserialize<D>(deserializer: D) -> std::result::Result<Self, D::Error>
        where
            D: Deserializer<'de>,
        {
            let s = String::deserialize(deserializer)?;
            FromStr::from_str(&s).map_err(de::Error::custom)
        }
    }

    fn run() -> Result<()> {
        let data = "color_name,color
    red,#ff0000
    green,#00ff00
    blue,#0000FF
    periwinkle,#ccccff
    magenta,#ff00ff"
            .to_owned();
        let mut out = Writer::from_writer(vec![]);
        let mut reader = Reader::from_reader(data.as_bytes());
        for result in reader.deserialize::<Row>() {
            let res = result?;
            out.serialize((
                res.color_name,
                res.color.red,
                res.color.green,
                res.color.blue,
            ))?;
        }
        // into_inner flushes the writer and returns the underlying Vec<u8>.
        let written = String::from_utf8(out.into_inner()?)?;
        assert_eq!(Some("magenta,255,0,255"), written.lines().last());
        println!("{}", written);
        Ok(())
    }

    quick_main!(run);

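The hex-to-RGB step can also be tried on its own, without csv, serde, or error_chain. The sketch below is a hypothetical std-only variant (`hex_to_rgb` and its `String` error type are assumptions made for illustration). It differs from the `FromStr` implementation above in two ways: it strips only a single leading `#`, and it checks that all six characters are ASCII hex digits before slicing, so byte-offset slices cannot land inside a multi-byte character.

```rust
/// Hypothetical std-only variant of the parsing step:
/// "#rrggbb" (or "rrggbb") -> (red, green, blue).
fn hex_to_rgb(hex: &str) -> Result<(u8, u8, u8), String> {
    // Remove one leading '#', if present.
    let digits = hex.strip_prefix('#').unwrap_or(hex);
    // Validate before slicing: exactly six ASCII hex digits.
    if digits.len() != 6 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("expected six hex digits, got {:?}", hex));
    }
    // Parse one two-digit channel from the given byte range.
    let channel = |range: std::ops::Range<usize>| {
        u8::from_str_radix(&digits[range], 16).map_err(|e| e.to_string())
    };
    Ok((channel(0..2)?, channel(2..4)?, channel(4..6)?))
}
```

For example, `hex_to_rgb("#ff00ff")` yields `Ok((255, 0, 255))`, matching the last row asserted in the example above, while a short value such as `"#fff"` is rejected.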