When writing a small parser for simple input, std::stringstream is a convenient way to build the tokenizer.
    std::string expression = "(2+3)*4";
    std::stringstream ss(expression);
    while (ss) { .... }
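This looks fine at first glance: the loop checks the state of the stringstream and exits once it has failed. In practice, though, the pattern is not entirely safe.
Let's look at another example.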
    #include <sstream>
    #include <iostream>

    int main() {
        std::string s = "123456";
        std::stringstream ss(s);
        while (ss) {
            char c = ss.peek();
            std::cout << c << ", " << (int)c << " , EOF = " << ss.eof() << std::endl;
            ss.get();
        }
        return 0;
    }
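You might expect this to print only the character codes of 1, 2, 3, 4, 5, 6, but an extra line with -1 shows up at the end. In other words, after 6 has been read the state of ss is not yet eof. The eofbit is only set once the next peek() is attempted.
So reading from a stringstream needs some care here: a valid ss does not guarantee that the next read will succeed, and the result still has to be checked after the read.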
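One way to do that check is to keep the result of peek() as int_type and compare it against traits_type::eof() before using it. The snippet below is only a minimal sketch of that idea; read_codes is a hypothetical helper name introduced for illustration, not something from the original code.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the character codes of `input`,
// checking the result of every peek() before extracting anything.
std::vector<int> read_codes(const std::string& input) {
    std::stringstream ss(input);
    std::vector<int> codes;
    while (true) {
        // Keep the value as int_type so EOF stays distinguishable from a real char.
        std::stringstream::int_type next = ss.peek();
        if (next == std::stringstream::traits_type::eof()) {
            break;  // nothing left to read; do not treat this as a character
        }
        codes.push_back(next);
        ss.get();  // consume the character that was just inspected
    }
    return codes;
}
```

Called as read_codes("123456"), this should return the six codes 49 through 54 with no trailing -1, because the loop exits on the result of the read itself rather than on the stream state from before the read.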
Taking peek() as an example, this is the behavior described in the C++11 standard:
int_type peek();
Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, reads but does not extract the current input character.
C++11 27.7.2.3
Returns: traits::eof() if good() is false. Otherwise, returns rdbuf()->sgetc().
What happens when sgetc() returns EOF is described in C++11 27.7.2.1:
If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then the input function, except as explicitly noted otherwise, completes its actions and does setstate(eofbit), which may throw ios_base::failure (27.5.5.4), before returning.
C++11 27.7.2.1
That is, peek() itself causes the eofbit to be set when its call to sgetc() returns traits::eof().
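To see the order in which the state bits change, the following sketch records them at each step. EndOfStreamStates and probe_end_of_stream are hypothetical names made up for this illustration, and the expected values assume a standard library that follows the wording quoted above.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of the stream state at the end of the input.
struct EndOfStreamStates {
    bool eof_after_last_get;    // eofbit right after the last character was extracted
    bool eof_after_peek;        // eofbit after peek() found nothing to read
    bool fail_after_peek;       // failbit after that same peek()
    bool fail_after_extra_get;  // failbit after one more get() past the end
};

EndOfStreamStates probe_end_of_stream(const std::string& input) {
    std::stringstream ss(input);
    // Extract every character; each of these get() calls succeeds.
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        ss.get();
    }
    EndOfStreamStates states{};
    states.eof_after_last_get = ss.eof();
    ss.peek();  // sgetc() returns traits::eof() here
    states.eof_after_peek = ss.eof();
    states.fail_after_peek = ss.fail();
    ss.get();   // the stream is no longer good, so this extraction fails
    states.fail_after_extra_get = ss.fail();
    return states;
}
```

For "123456" the expectation is that eof is still false after the last get(), becomes true after peek() while fail stays false, and fail only becomes true after the extra get(). Since while(ss) tests for failure rather than eofbit, this would explain why the loop in the earlier example runs one more time than expected.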
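There is also an online discussion worth reading. Note that it is fairly old, so some of what it describes may not match newer versions of the standard.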
https://comp.lang.cpp.moderated.narkive.com/vwstw4Un/std-stringstream-and-eof-strangeness