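Suppose that, for some profound reason, you end up having to write an application/x-www-form-urlencoded parser in 2014. Here is a spec for it.
It follows the W3C spec as a baseline, while aiming to stay compatible with existing applications.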
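- Split the application/x-www-form-urlencoded payload on "&" (U+0026) or ";" (U+003B)
- Prepare an array to hold the name-value pairs
- Process each of the resulting strings as follows:
  - If the first character of the string is " " (U+0020), remove it
  - If the string contains "=", the characters before the first "=" are the name and the characters after it are the value. If nothing follows the first "=", the value is the empty string. If "=" is the first character of the string, the name is the empty string. If the string contains no "=", the whole string is the name and the value is the empty string.
  - Replace every "+" (U+002B) with " " (U+0020)
  - Unescape the name and the value, and push them onto the array
- Return the array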
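The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation. The function name `parse_urlencoded` is made up for illustration. "Unescape" is assumed to mean percent-decoding as UTF-8 via `urllib.parse.unquote`. Returning an empty list for an empty payload is an assumption that the steps themselves do not state.

```python
import re
from typing import List
from urllib.parse import unquote


def parse_urlencoded(payload: str) -> List[str]:
    """Return a flat [name, value, name, value, ...] list (hypothetical helper)."""
    pairs: List[str] = []
    # Assumption: an empty payload yields no pairs at all,
    # rather than a single pair of empty strings.
    if payload == "":
        return pairs
    # Split on "&" (U+0026) or ";" (U+003B); empty segments are kept.
    for segment in re.split(r"[&;]", payload):
        # Remove a single leading " " (U+0020), if there is one.
        if segment.startswith(" "):
            segment = segment[1:]
        # Split on the first "=" only; with no "=" the value is empty.
        name, _, value = segment.partition("=")
        # Turn "+" into " " before percent-decoding, so that "%2B"
        # still decodes to a literal "+".
        name = unquote(name.replace("+", " "))
        value = unquote(value.replace("+", " "))
        pairs.extend([name, value])
    return pairs


print(parse_urlencoded("a=b; c =d"))  # ['a', 'b', 'c ', 'd']
```

Because the leading-space check runs before the "+" replacement and the percent-decoding, a leading "+" or "%20" survives as a space in the name, while a literal leading space is dropped.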
The test data would look something like this:
'a=b&c=d' => ["a","b","c","d"]
'a=b;c=d' => ["a","b","c","d"]
'a=1&b=2;c=3' => ["a","1","b","2","c","3"]
'a==b&c==d' => ["a","=b","c","=d"]
'a=b& c=d' => ["a","b","c","d"]
'a=b; c=d' => ["a","b","c","d"]
'a=b; c =d' => ["a","b","c ","d"]
'a=b;c= d ' => ["a","b","c"," d "]
'a=b&+c=d' => ["a","b"," c","d"]
'a=b&+c+=d' => ["a","b"," c ","d"]
'a=b&c=+d+' => ["a","b","c"," d "]
'a=b&%20c=d' => ["a","b"," c","d"]
'a=b&%20c%20=d' => ["a","b"," c ","d"]
'a=b&c=%20d%20' => ["a","b","c"," d "]
'a&c=d' => ["a","","c","d"]
'a=b&=d' => ["a","b","","d"]
'a=b&=' => ["a","b","",""]
'&' => ["","","",""]
'=' => ["",""]
'' => []