BIF: Built-In Function

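A BIF is a function built into Erlang. For example, tuple_to_list/1 converts a tuple to a list, and time/0 returns the current time. Most BIFs belong to the erlang module, but because the commonly used ones are auto-imported, you can write tuple_to_list(...) instead of erlang:tuple_to_list(...).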

Documentation for the erlang module is available at http://www.erlang.org/doc/man/erlang.html.
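As a minimal sketch of the auto-import behaviour, the module below calls the same BIF in both its short and its fully qualified form. The module name bif_demo and the function to_list/1 are hypothetical names chosen for illustration.

```erlang
-module(bif_demo).
-export([to_list/1]).

%% Calls the same BIF twice: once through the auto-imported name and
%% once with the erlang module spelled out.
to_list(Tuple) ->
    Short = tuple_to_list(Tuple),
    Long = erlang:tuple_to_list(Tuple),
    Short = Long,   % the match fails if the two results ever differ
    Short.
```

Calling bif_demo:to_list({a, b, c}) should return the list [a,b,c] from either form.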

Function specifications: @spec

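A function definition does not state the data types of its arguments or return value, so we need some way to tell the programmers who use the function how to call it. The Erlang community developed a notation for this. The notation is not part of the Erlang language itself; it is only a documentation tool.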

Because the notation exists only for documentation, it appears in source code on lines commented out with %%, usually in the form %% @spec:

-module(math).
-export([fac/1]).

%% @spec fac(integer()) -> integer().

fac(0) -> 1;
fac(N) -> N * fac(N-1).

Using this type notation involves defining two things: types and function specifications.

Defining types

A type named typeName is written typeName().

The predefined types are:

any(): any Erlang data type; term() is an alias for any()
atom(), binary(), float(), function(), integer(), pid(), port(), reference(): the basic Erlang data types
bool(): the atom true or false
char(): a subset of integer() that represents characters
iolist(): defined recursively as [char() | binary() | iolist()]; commonly used to build character output efficiently
tuple()
list(L): an alias for [L]
nil(): the empty list []
string(): an alias for list(char())
deep_string(): defined recursively as [char() | deep_string()]
none(): no data type; used for functions that never return, such as an infinite receive loop

A user-defined type is written as:
@type newType() = TypeExpression

Example
@type onOff() = on|off.
@type person() = {person, name(), age()}.
@type people() = [person()].
@type name() = {firstname, string()}.
@type age() = integer().
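To show how these type definitions might sit next to real code, here is a small sketch that places them in %% comments above functions that build and read the person() structure. The module name people_demo and the functions make_person/2 and names/1 are hypothetical; the comments are documentation only and are not checked by the compiler.

```erlang
-module(people_demo).
-export([make_person/2, names/1]).

%% @type name() = {firstname, string()}.
%% @type age() = integer().
%% @type person() = {person, name(), age()}.
%% @type people() = [person()].

%% @spec make_person(string(), integer()) -> person().
%% Builds a tuple shaped like the person() type above.
make_person(First, Age) ->
    {person, {firstname, First}, Age}.

%% @spec names(people()) -> [string()].
%% Extracts the first name from every person() in the list.
names(People) ->
    [First || {person, {firstname, First}, _Age} <- People].
```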

Specifying a function's input and output types

The form is:
@spec functionName(T1, T2, ..., Tn) -> Tret
T1, T2, ..., Tn are the argument types, and Tret is the type of the return value.

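Each T takes one of three forms:

TypeVar
A type variable, standing for an unknown type (unrelated to Erlang variables)
TypeVar::Type
A type variable followed by a type
Type
A type expression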

Example

@spec file:open(FileName, Mode) -> {ok, Handle} | {error, Why}.
@spec file:read_line(Handle) -> {ok, Line} | eof.

The spec for file:open/2 says that opening FileName returns either {ok, Handle} or {error, Why}.
FileName and Mode are type variables, but we do not know their exact types.

Example

@spec lists:map(fun(A)->B, [A]) -> [B].
@spec lists:filter(fun(X) -> bool(), [X]) -> [X].

Example

@spec file:open(FileName::string(), [mode()]) -> {ok, Handle::file_handle()} | {error, Why::string()}.
@type mode() = read|write|compressed|raw|binary| ...

Example

@spec file:open(string(), Modes) -> {ok, Handle} | {error, string()}
    Handle = file_handle(),
    Modes = [Mode],
    Mode = read|write|compressed|raw|binary| ...

Example

@spec file:open(string(), [mode()]) -> {ok,file_handle()} | error().
@type error() = {error, string()}.
@type mode() = read|write|compressed|raw|binary| ...

Definitions in documentation

In documentation, the @spec tag is omitted:

file:open(FileName, Mode) -> {ok, Handle} | {error, Why}.
    Opens the file FileName according to Mode. Mode is ....
file:read_line(Handle) -> {ok, Line} | eof.
    Reads one line from the open file Handle and returns Line; at end of file it returns eof.

Tools that use @spec

EDoc
Erlang's documentation generator, similar to javadoc. It embeds documentation in source code comments through annotations such as @name, @doc, @type, and @author.

Dialyzer
A static analysis tool that can find type errors, unreachable code, unnecessary tests, ...

binary

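A binary stores large amounts of raw data. It is written as a sequence of integers or strings enclosed between << and >>, and each integer must be in the range 0 to 255. <<"cat">> is shorthand for <<99,97,116>>. If every byte in a binary is a printable character, the shell prints it as a string.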

1> <<5,10,20>>.
<<5,10,20>>
2> <<"cat">>.
<<"cat">>
3> <<99,97,116>>.
<<"cat">>

BIFs for working with binaries

@spec list_to_binary(IoList) -> binary()
list_to_binary builds a binary from the integers and binaries in IoList. IoList is a list whose elements are integers from 0 to 255, binaries, or other IoLists.

4> Bin1 = <<1,2,3>>.
<<1,2,3>>
5> Bin2 = <<4,5>>.
<<4,5>>
6> Bin3 = <<6>>.
<<6>>
7> list_to_binary([Bin1, 1, [2,3,Bin2], 4|Bin3]).
<<1,2,3,1,2,3,4,5,4,6>>

@spec split_binary(Bin, Pos) -> {Bin1, Bin2}
Splits Bin into two parts at position Pos.

8> split_binary(<<1,2,3,4,5,6,7,8,9,10>>, 3).
{<<1,2,3>>,<<4,5,6,7,8,9,10>>}

@spec term_to_binary(Term) -> Bin
Converts any Erlang term to a binary. Once a term is in binary form, it can be stored in a file or sent over the network, and the original term can be reconstructed from it.

@spec binary_to_term(Bin) -> Term
The inverse of term_to_binary: converts Bin back to a Term.

9> B = term_to_binary({binaries, "are", useful}).
<<131,104,3,100,0,8,98,105,110,97,114,105,101,115,107,0,3,
  97,114,101,100,0,6,117,115,101,102,117,108>>
10> binary_to_term(B).
{binaries,"are",useful}
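The round trip above can be extended to a file. The following is a minimal sketch, assuming the standard file:write_file/2 and file:read_file/1 calls; the module name term_store and the functions save/2 and load/1 are hypothetical, and error handling is left out on purpose.

```erlang
-module(term_store).
-export([save/2, load/1]).

%% Serializes Term with term_to_binary/1 and writes the bytes to Path.
%% Returns ok, or {error, Reason} from file:write_file/2.
save(Path, Term) ->
    file:write_file(Path, term_to_binary(Term)).

%% Reads the bytes back and rebuilds the original term.
%% Crashes with a badmatch if the file cannot be read.
load(Path) ->
    {ok, Bin} = file:read_file(Path),
    binary_to_term(Bin).
```

One caution for real use: binary_to_term/1 on data from an untrusted source can create new atoms, so this sketch is only appropriate for files the program wrote itself.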

@spec size(Bin) -> Int
Returns the number of bytes in the binary.

11> size(B).
29
12> size(<<1,2,3,4>>).
4
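split_binary/2 and size/1 combine naturally. As a sketch, the hypothetical module chunk_demo below cuts a binary into pieces of at most N bytes; the names chunks/2 and chunks/3 are made up for this example.

```erlang
-module(chunk_demo).
-export([chunks/2]).

%% Splits Bin into a list of binaries, each at most N bytes long.
chunks(Bin, N) when is_binary(Bin), is_integer(N), N > 0 ->
    chunks(Bin, N, []).

%% Nothing left: return the pieces in their original order.
chunks(<<>>, _N, Acc) ->
    lists:reverse(Acc);
%% The remainder fits in one piece: it becomes the last chunk.
chunks(Bin, N, Acc) when size(Bin) =< N ->
    lists:reverse([Bin | Acc]);
%% Otherwise cut off the first N bytes and continue with the rest.
chunks(Bin, N, Acc) ->
    {Head, Tail} = split_binary(Bin, N),
    chunks(Tail, N, [Head | Acc]).
```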

Bit syntax

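The bit syntax is an extension of pattern matching for unpacking and packing individual bits or bit sequences in binary data. It is well suited to low-level code and network protocol programs, and it is one of Erlang's strongest features.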

In the variable M below, X occupies 3 bits, Y occupies 7 bits, and Z occupies 6 bits.

14> X=2.
2
15> Y=10.
10
16> Z=15.
15
17> M = <<X:3, Y:7, Z:6>>.
<<66,143>>

Example: 16-bit RGB

Suppose a 16-bit RGB color uses 5 bits for R, 6 bits for G, and 5 bits for B.

19> Red = 2.
2
20> Green=61.
61
21> Blue=20.
20
22> Mem = <<Red:5,Green:6,Blue:5>>.
<<23,180>>
23> <<R1:5,G1:6,B1:5>> = Mem.
<<23,180>>
24> R1.
2
25> G1.
61
26> B1.
20
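The shell session above can be wrapped into reusable functions. This sketch uses a hypothetical module rgb565 with pack/3 and unpack/1, plus pack_shift/3, which builds the same 16 bits with shifts and ors to make the layout explicit: red in the top 5 bits, green in the next 6, blue in the low 5.

```erlang
-module(rgb565).
-export([pack/3, unpack/1, pack_shift/3]).

%% Packs the three components into 16 bits using the bit syntax.
pack(R, G, B) ->
    <<R:5, G:6, B:5>>.

%% Unpacks a 16-bit value by matching the same layout.
unpack(<<R:5, G:6, B:5>>) ->
    {R, G, B}.

%% The same packing done by hand. Unlike the bit syntax, this does
%% not mask each component to its field first, so it assumes
%% R and B are in 0..31 and G is in 0..63.
pack_shift(R, G, B) ->
    Word = (R bsl 11) bor (G bsl 5) bor B,
    <<Word:16>>.
```

With the values from the session, (2 bsl 11) + (61 bsl 5) + 20 = 4096 + 1952 + 20 = 6068 = 23 * 256 + 180, which matches the two bytes 23 and 180 printed by the shell.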

位元語法表示式

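A bit syntax expression has the form shown below, where each Ei takes one of four forms. The total number of bits must be a multiple of 8 for the result to be a binary (otherwise the value is a bitstring rather than a binary). When constructing a binary, Value must be a bound variable, string, integer, float, or binary. In pattern matching, Value may be a bound or unbound variable, string, integer, float, or binary.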

TypeSpecifierList is a list of type specifiers separated by hyphens, in the form End-Sign-Type-Unit.

<<>>
<<E1, E2, ..., En>>

Ei = Value |
    Value:Size|
    Value/TypeSpecifierList |
    Value:Size/TypeSpecifierList

@type End = big|little|native

@type Sign = signed | unsigned

@type Type = integer | float | binary

@type Unit = 1|2|...256

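End is the endianness and depends on the machine; the default is big. Take the value 16#12345678 written to memory starting at address 0x0000: little-endian stores the bytes as 16#78 16#56 16#34 16#12, while big-endian stores them as 16#12 16#34 16#56 16#78, with the most significant byte at the lowest address. native means the byte order is determined at run time by the CPU.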

Among common CPUs: Intel x86 and DEC VAX are little-endian designs; HP, IBM, and Motorola 68K series are big-endian designs; PowerPC supports both formats and is called bi-endian.

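Sign is used only in pattern matching; the default is unsigned.
The default Type is integer.
The default Unit depends on Type: it is 1 when Type is integer or float, and 8 when Type is binary. Size * Unit gives the number of bits the segment occupies, and the total for the whole binary must be a multiple of 8.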
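The Sign, Type, and Unit rules above are easier to see in patterns. The sketch below uses a hypothetical module spec_demo; the 16-bit length header in header_and_rest/1 is an invented layout for illustration, not a real protocol.

```erlang
-module(spec_demo).
-export([as_signed/1, as_unsigned/1, header_and_rest/1, split_at/2]).

%% The same byte read with each Sign: 255 unsigned, -1 signed.
as_signed(<<N:8/signed>>) -> N.
as_unsigned(<<N:8/unsigned>>) -> N.

%% A full End-Sign-Type list on the first segment. The trailing
%% binary segment has no Size, so it matches all remaining bytes.
header_and_rest(<<Len:16/big-unsigned-integer, Rest/binary>>) ->
    {Len, Rest}.

%% For Type binary the default Unit is 8, so Size counts bytes:
%% Head takes Pos * 8 bits. This mirrors split_binary/2.
split_at(Bin, Pos) ->
    <<Head:Pos/binary, Tail/binary>> = Bin,
    {Head, Tail}.
```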

27> {<<16#12345678:32/big>>, <<16#12345678:32/little>>, <<16#12345678:32/native>>, <<16#12345678:32>>}.
{<<18,52,86,120>>,
 <<120,86,52,18>>,
 <<120,86,52,18>>,
 <<18,52,86,120>>}

The output shows that this machine is little-endian.
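The same check can be made in code instead of by reading shell output. This is a small sketch with a hypothetical module endian_demo: it encodes the integer 1 in 16 bits with the native specifier and looks at which byte holds it.

```erlang
-module(endian_demo).
-export([native_endian/0]).

%% Encodes 1 as a 16-bit native-endian integer. On a little-endian
%% machine the low byte comes first, giving the bytes 1,0; on a
%% big-endian machine the bytes are 0,1.
native_endian() ->
    case <<1:16/native>> of
        <<1, 0>> -> little;
        <<0, 1>> -> big
    end.
```

On the machine used for the session above, this would be expected to return little.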

References

Erlang and OTP in Action
Programming Erlang: Software for a Concurrent World

cctg

2014/2/26

erlang basics - binary, bitstring, BIF