2019/6/16

Julia Data Visualization


This post has three parts, Basic Plots, Vega, and Gadfly, and discusses how to do data visualization in Julia.


Basic plots


Basic plots use PyPlot, a Julia interface to the functionality provided by Python's matplotlib.pyplot module.


Install matplotlib before using it:


python -m pip install matplotlib

Install PyPlot in Julia:


using Pkg
Pkg.add("PyPlot")

Test PyPlot in Julia with the following commands:


using PyPlot
x = 1:100
y = rand(100)
p = PyPlot.plot(x,y)
xlabel("x")
ylabel("y")
title("basic plot")
grid(true)

This draws a line plot of the 100 random values.
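To keep the result as a file rather than only in a plot window, the figure can be written out with savefig, which PyPlot forwards to matplotlib. The following is a minimal sketch, assuming a writable temporary directory; the file name basic_plot.png and the dpi value are placeholders, not part of the original example.

```julia
using PyPlot

# Same kind of data as the basic plot: 100 random values against their index.
x = 1:100
y = rand(100)

fig = figure()
plot(x, y)
xlabel("x")
ylabel("y")
title("basic plot")
grid(true)

# Placeholder output path (an assumption for this sketch).
outfile = joinpath(mktempdir(), "basic_plot.png")

# matplotlib picks the output format from the file extension.
savefig(outfile, dpi=150)
```

Changing the extension to .pdf or .svg should select those formats instead, since matplotlib infers the format from the file name.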



Another example:


using PyPlot

x = range(0, stop=4pi, length=1000)
y = cos.(pi .+ sin.(x))

xlabel("x-axis")
ylabel("y-axis")
title("using sin and cos functions")

plot(x, y, color="red")


XKCD mode draws graphs in a casual, hand-drawn style.


using PyPlot

x = collect(1:10)
# y[i] = pi + i^2, computed with broadcasting instead of a loop
y = pi .+ x .^ 2

xkcd()
xlabel("x-axis")
ylabel("y-axis")
title("XKCD")
plot(x,y)


bar chart


using PyPlot

x = [10,20,30,40,50]
y = [2,4,6,8,10]
xlabel("x-axis")
ylabel("y-axis")
title("Vertical bar graph")
bar(x, y, color="red")


horizontal bar chart


clf()
x = [10,20,30,40,50]
y = [2,4,6,8,10]
title("Horizontal bar graph")
xlabel("x-axis")
ylabel("y-axis")
barh(x,y,color="red")

2D histogram


clf()
x = rand(1000)
y = rand(1000)
xlabel("x-axis")
ylabel("y-axis")
title("2D Histogram")
hist2D(x, y, bins=50)

pie chart


clf()
labels = ["Fruits";"Vegetables";"Wheat"]
colors = ["Orange";"Blue";"Red"]
sizes = [25;40;35]
explode = zeros(length(sizes))
fig = figure("piechart", figsize=(10,10))
p = pie(sizes, labels=labels, explode=explode, shadow=true, startangle=90, colors=colors)
title("Pie charts")


Scatter chart


clf()
fig = figure("scatterplot", figsize = (10,10))
x = rand(50)
y = rand(50)
areas = 1000*rand(50);
scatter(x, y, s=areas, alpha=0.5)
xlabel("x-axis")
ylabel("y-axis")
title("Scatter Plot")



PyPlot draws 3D surface plots with surf(x, y, z, facecolors=colors).


Parameter    Description
X, Y, Z      Data values as 2D arrays
rstride      Array row stride (step size)
cstride      Array column stride (step size)
rcount       Use at most this many rows, defaults to 50
ccount       Use at most this many columns, defaults to 50
color        Color of the surface patches
cmap         A colormap for the surface patches
facecolors   Face colors for the individual patches
norm         An instance of Normalize to map values to colors
vmin         Minimum value to map
vmax         Maximum value to map
shade        Whether to shade the facecolors

using PyPlot

clf()
a = range(0.0, stop=2pi, length=500)
b = range(0.0, stop=2pi, length=500)

len_a = length(a)
len_b = length(b)

x = ones(len_a, len_b)
y = ones(len_a, len_b)
z = ones(len_a, len_b)

for i=1:len_a
    for j=1:len_b
        x[i,j] = sin(a[i])
        y[i,j] = cos(a[i])
        z[i,j] = sin(b[j])
    end
end

colors = rand(len_a, len_b, 3)
fig = figure()
surf(x, y, z, facecolors=colors)
fig.canvas.draw()
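Some of the other parameters in the table can be tried on the same cylinder-shaped surface. The sketch below is one possible variation, not the original author's code: it fills the 2D arrays with broadcasting instead of nested loops, colors the surface with cmap instead of random facecolors, and uses rstride and cstride to sample every 5th row and column. The grid size of 100 and the "viridis" colormap are arbitrary choices made for illustration.

```julia
using PyPlot

# Parameter grids: u runs around the circle, v controls the height.
u = range(0.0, stop=2pi, length=100)
v = range(0.0, stop=2pi, length=100)

# Broadcasting a column vector against a row vector fills the 2D arrays
# without explicit loops.
x = sin.(u) .* ones(length(v))'
y = cos.(u) .* ones(length(v))'
z = ones(length(u)) .* sin.(v)'

fig = figure()
# cmap maps z values to colors; rstride/cstride use every 5th row/column.
surf(x, y, z, cmap="viridis", rstride=5, cstride=5)
```

Larger strides draw fewer, coarser patches and render faster; a stride of 1 uses every grid point.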


Gadfly


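Gadfly is a plotting library that can output images as SVG, PNG, PostScript, and PDF. It also works with IJulia, integrates closely with DataFrames, and provides pan, zoom, and toggle features. After Gadfly.plot runs, the browser opens an HTML file containing the SVG image.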


using Pkg
Pkg.add("Gadfly")
using Gadfly
Gadfly.plot(x = rand(10), y=rand(10))


# Line chart
Gadfly.plot(x = rand(10),y=rand(10), Geom.point, Geom.line)

Gadfly.plot(x=1:10, y=[10^n for n in rand(10)], Scale.y_sqrt, Geom.point, Geom.smooth, Guide.xlabel("x"), Guide.ylabel("y"), Guide.title("Graph with labels"))
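Instead of opening the plot in a browser, it can be written straight to a file with Gadfly's draw function. The following is a minimal sketch of the SVG case, assuming a writable temporary directory; the file name gadfly_plot.svg and the 6 by 4 inch size are placeholders.

```julia
using Gadfly

# Keep the plot object instead of displaying it right away.
p = Gadfly.plot(x=1:10, y=rand(10), Geom.point, Geom.line,
                Guide.xlabel("x"), Guide.ylabel("y"))

# Placeholder output path (an assumption for this sketch).
outfile = joinpath(mktempdir(), "gadfly_plot.svg")

# SVG is the backend that works with Gadfly alone; inch is a unit it exports.
draw(SVG(outfile, 6inch, 4inch), p)
```

The PNG, PDF, and PS backends follow the same draw pattern, but as far as I know they additionally require the Cairo and Fontconfig packages to be installed and loaded.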




Plotting DataFrames with Gadfly

Use RDatasets, which ships a number of sample datasets, to produce a DataFrame for the plot function.


Line chart


using RDatasets
Gadfly.plot(dataset("datasets", "iris"),
        x="SepalLength",
        y="SepalWidth",
        Geom.line)


Point Plot


Gadfly.plot(dataset("datasets", "iris"),
        x="SepalLength",
        y="SepalWidth",
        Geom.point)


This plots SepalLength against SepalWidth.
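The DataFrame does not have to come from RDatasets. The sketch below builds a small one by hand and plots it the same way, assuming the DataFrames package is installed. The column names and values are made up purely for illustration, and the columns are referenced with symbols, the form used in Gadfly's own documentation.

```julia
using DataFrames
using Gadfly

# Made-up measurements, for illustration only.
df = DataFrame(height = [150, 160, 170, 180, 190],
               weight = [52, 58, 69, 77, 88],
               group  = ["a", "a", "b", "b", "b"])

# Map columns to aesthetics by name; color splits the points by group.
p = Gadfly.plot(df, x=:height, y=:weight, color=:group, Geom.point)
```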



histogram


Gadfly.plot(x=randn(4000), Geom.histogram(bincount=100))


A histogram built from a DataFrame column, colored by Gender:


Gadfly.plot(dataset("mlmRev", "Gcsemv"),
        x = "Course", color="Gender", Geom.histogram)


References


Learning Julia
