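A few days ago I did some data cleanup on a batch of song and singer data. For an algorithm service, all of it had to be consolidated into a single file in a fixed format. Below is how I did it. The code may be a bit naive, so if you have suggestions please do share them; if it helps, we can all improve together.
1. Data format in the files and the target format
Every record has the following format, one record per line: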
{"album_pic": "http://y.gtimg.cn/music/photo_new/T002R300x300M000004QnEHc3zjC7J.jpg", "public_time": "2018-10-17", "track_name": "年少有为", "album_id": "004QnEHc3zjC7J", "id": "004DXFlC0nsTCZ", "album_name": "耳朵", "singer_name": [{"name": "李荣浩", "id": "000aHmbL2aPXWH"}], "hot": 1817643}
The records need to be consolidated (and deduplicated) into:
//SONG(1),SINGER(2),ALBUM(4)
年少有为=1,李荣浩=2,耳朵=4
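If two names are identical, for example a song titled "飞云之下" whose album is also titled "飞云之下", the two type numbers are combined with a bitwise OR (song 1 | album 4), giving: 飞云之下=5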
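The combination rule can be sketched in a few lines of plain Java. This is only a minimal illustration: the class `FlagDemo` and the helpers `addName` and `hasType` are hypothetical names that do not appear in the program below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FlagDemo {
    // Type flags from the format description: SONG(1), SINGER(2), ALBUM(4)
    public static final int SONG = 1;
    public static final int SINGER = 2;
    public static final int ALBUM = 4;

    // Hypothetical helper: stores a name, OR-ing the new flag into any flag already stored
    public static void addName(Map<String, Integer> flags, String name, int flag) {
        flags.merge(name, flag, (oldFlag, newFlag) -> oldFlag | newFlag);
    }

    // Hypothetical helper: true if the combined value contains the given type flag
    public static boolean hasType(int combined, int flag) {
        return (combined & flag) != 0;
    }

    public static void main(String[] args) {
        Map<String, Integer> flags = new LinkedHashMap<>();
        addName(flags, "飞云之下", SONG);
        addName(flags, "飞云之下", ALBUM); // same name again, this time as an album
        addName(flags, "李荣浩", SINGER);

        System.out.println(flags); // {飞云之下=5, 李荣浩=2}
        System.out.println(hasType(flags.get("飞云之下"), SINGER)); // false
    }
}
```

Because each type uses its own bit, the combined value can always be decoded again with a bitwise AND, which is why 1, 2 and 4 are used instead of 1, 2 and 3.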
2. Code
package com.example.run;

import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;

import java.io.*;
import java.util.HashMap;
import java.util.Map;

/**
 * Author: RL
 * Date: 2020/3/25
 * Time: 19:01
 * Description: No Description
 */
public class Temp1 {
    // Type flags: SONG(1), SINGER(2), ALBUM(4)
    static Map<String, Integer> outMap1 = new HashMap<>(); // song names
    static Map<String, Integer> outMap2 = new HashMap<>(); // singer names
    static Map<String, Integer> outMap4 = new HashMap<>(); // album names
    static Map<String, Integer> outMap = new HashMap<>();  // merged result

    public static void main(String[] args) throws IOException {
        String encoding = "UTF-8";
        String dir = "D:\\music-data\\ftpu";
        File[] files = new File(dir).listFiles();
        if (files == null) { // listFiles() returns null if dir is missing or not a directory
            System.out.println("Directory not found, please check the path: " + dir);
            return;
        }
        for (File file : files) { // loop over the files in the folder
            if (file.isFile()) {
                importFile(file, encoding); // read the file and collect its names into the maps
                System.out.println(file.getName());
            } else {
                System.out.println("Not a regular file, skipped: " + file.getName());
            }
        }
        outFile();
    }

    public static void importFile(File file, String encoding) throws IOException {
        // try-with-resources closes the reader even if reading or parsing fails
        try (BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), encoding))) {
            String lineTxt;
            while ((lineTxt = bufferedReader.readLine()) != null) { // one JSON record per line
                if (lineTxt.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                JSONObject object = JSON.parseObject(lineTxt);
                // a song can have several singers; JSONArray exposes size(), not length()
                JSONArray singerName = object.getJSONArray("singer_name");
                for (int i = 0; singerName != null && i < singerName.size(); i++) {
                    outMap2.put(singerName.getJSONObject(i).getString("name"), 2);
                }
                outMap1.put(object.getString("track_name"), 1);
                outMap4.put(object.getString("album_name"), 4);
            }
        }
    }

    public static void outFile() throws IOException {
        System.out.println("Merging and deduplicating:");
        // Merge the three maps; a name found in more than one map gets its flags OR-ed,
        // e.g. song + album = 1 | 4 = 5. Nothing is removed while iterating.
        mergeFlags(outMap1);
        mergeFlags(outMap2);
        mergeFlags(outMap4);
        // write name=flag pairs separated by commas, UTF-8 encoded
        try (Writer pw = new OutputStreamWriter(new FileOutputStream("D:\\musicData.txt"), "UTF-8")) {
            boolean first = true;
            for (Map.Entry<String, Integer> entry : outMap.entrySet()) {
                if (!first) {
                    pw.write(",");
                }
                pw.write(entry.getKey() + "=" + entry.getValue());
                first = false;
            }
        }
    }

    private static void mergeFlags(Map<String, Integer> source) {
        for (Map.Entry<String, Integer> entry : source.entrySet()) {
            outMap.merge(entry.getKey(), entry.getValue(), (a, b) -> a | b);
        }
    }
}
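3. Run the main function and you're done~
The business logic in the middle is a bit tedious, but the overall framework can be adapted to similar tasks.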
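As one possible way to reuse the framework, the "read every file, transform each line, write once" skeleton can be separated from the business logic. The sketch below uses only the JDK (`java.nio.file`). The class `LineMerger`, the method `mergeLines` and the idea of passing the per-line logic as a `Function` are illustrative assumptions, not part of the program above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class LineMerger {
    // Reads every regular file in inputDir line by line, maps each non-blank line to a key
    // with the supplied function, and writes the distinct keys (sorted) to outputFile.
    public static Set<String> mergeLines(Path inputDir, Path outputFile,
                                         Function<String, String> keyOf) throws IOException {
        Set<String> keys = new TreeSet<>(); // a set drops duplicates; TreeSet also sorts
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-directories
                }
                try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        if (!line.trim().isEmpty()) {
                            keys.add(keyOf.apply(line)); // business logic is plugged in here
                        }
                    }
                }
            }
        }
        Files.write(outputFile, keys, StandardCharsets.UTF_8); // one key per line
        return keys;
    }

    public static void main(String[] args) throws IOException {
        // Demo on temporary files so the sketch runs on any machine
        Path dir = Files.createTempDirectory("merge-demo");
        Files.write(dir.resolve("a.txt"), Arrays.asList("耳朵", "年少有为"), StandardCharsets.UTF_8);
        Files.write(dir.resolve("b.txt"), Arrays.asList("耳朵", "李荣浩"), StandardCharsets.UTF_8);
        Path out = Files.createTempFile("merged", ".txt");

        Set<String> keys = mergeLines(dir, out, line -> line.trim());
        System.out.println(keys.size()); // 3, because the duplicate "耳朵" is stored once
    }
}
```

Only the function passed as `keyOf` would need to change for a different data set; the file handling stays the same.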