xref: /openbmc/bmcweb/COMMON_ERRORS.md (revision 81d523a7)
1# Commonly recurring errors in bmcweb
2
3What follows is a list of common errors that new users to bmcweb tend to make
4when operating within its bounds for the first time.  If this is your first time
5developing in bmcweb, the maintainers highly recommend reading and understanding
6_all_ of common traps before continuing with any development.  Every single one
7of the examples below compile without warnings, but are incorrect in
8not-always-obvious ways, or impose a pattern that tends to cause hard to find
9bugs, or bugs that appear later.  Every one has been submitted to code review
10multiple times.
11
12### 1. Directly dereferencing a pointer without checking for validity first
13```C++
14int myBadMethod(const nlohmann::json& j){
15    const int* myPtr = j.get_if<int>();
16    return *myPtr;
17}
18```
19This pointer is not guaranteed to be filled, and could be a null dereference.
20
21### 2. String views aren't null terminated
22```C++
23int getIntFromString(const std::string_view s){
24    return std::atoi(s.data());
25}
26```
27This will give the right answer much of the time, but has the possibility to
28fail when string\_view is not null terminated.  Use from\_chars instead, which
29takes both a pointer and a length
30
31### 3. Not handling input errors
32```C++
33int getIntFromString(const std::string& s){
34    return std::atoi(s.c_str());
35}
36```
37In the case where the string is not representable as an int, this will trigger
38undefined behavior at system level.  Code needs to check for validity of the
39string, ideally with something like from\_chars, and return the appropriate error
40code.
41
42### 4. Walking off the end of a string
43```C++
44std::string getFilenameFromPath(const std::string& path){
45    size_t index = path.find("/");
46    if (index != std::string::npos){
47        // If the string ends with "/", this will walk off the end of the string.
48        return path.substr(pos + 1);
49    }
50    return "";
51}
52```
53
54### 5. Using methods that throw (or not handling bad inputs)
55```C++
56int myBadMethod(nlohmann::json& j){
57    return j.get<int>();
58}
59```
60This method throws, and bad inputs will not be handled
61
62Commonly used methods that fall into this pattern:
63
64- std::variant::get
65- std::vector::at
66- std::map::at
67- std::set::at
68- std::\<generic container type\>::at
69- nlohmann::json::operator!=
70- nlohmann::json::operator+=
71- nlohmann::json::at
72- nlohmann::json::get
73- nlohmann::json::get\_ref
74- nlohmann::json::get\_to
75- nlohmann::json::operator\<\<
76- nlohmann::json::operator\>\>
77- std::filesystem::create\_directory
78- std::filesystem::rename
79- std::filesystem::file\_size
80- std::stoi
81- std::stol
82- std::stoll
83
84#### Special note: JSON
85
86`nlohmann::json::parse` by default
87[throws](https://json.nlohmann.me/api/basic_json/parse/) on failure, but
88also accepts an optional argument that causes it to not throw: set the 3rd
89argument to `false`.
90
91`nlohmann::json::dump` by default
92[throws](https://json.nlohmann.me/api/basic_json/dump/) on failure, but also
93accepts an optional argument that causes it to not throw: set the 4th
94argument to `replace`.  Although `ignore` preserves content 1:1, `replace`
95is preferred from a security point of view.
96
97#### Special note: Boost
98there is a whole class of boost asio functions that provide both a method that
99throws on failure, and a method that accepts and returns an error code.  This is
100not a complete list, but users should verify in the boost docs when calling into
101asio methods, and prefer the one that returns an error code instead of throwing.
102
103- boost::asio::ip::tcp::acceptor::bind();
104- boost::asio::ip::tcp::acceptor::cancel();
105- boost::asio::ip::tcp::acceptor::close();
106- boost::asio::ip::tcp::acceptor::listen();
107- boost::asio::ip::address::make\_address();
108
109### 6. Blocking functions
110
111bmcweb uses a single reactor for all operations.  Blocking that reactor for any
112amount of time causes all other operations to stop.  The common blocking
113functions that tend to be called incorrectly are:
114
115- sleep()
116- boost::asio::ip::tcp::socket::read()
117- boost::asio::ip::tcp::socket::read\_some()
118- boost::asio::ip::tcp::socket::write()
119- boost::asio::ip::tcp::socket::write\_some()
120- boost::asio::ip::tcp::socket::connect()
121- boost::asio::ip::tcp::socket::send()
122- boost::asio::ip::tcp::socket::wait()
123- boost::asio::steady\_timer::wait()
124
125Note: an exception is made for filesystem/disk IO read and write.  This is
126mostly due to not having great abstractions for it that mate well with the async
127system, the fact that most filesystem accesses are into tmpfs (and therefore
128should be "fast" most of the time) and in general how little the filesystem is
129used in practice.
130
131### 7. Lack of locking between subsequent calls
132While global data structures are discouraged, they are sometimes required to
133store temporary state for operations that require it.  Given the single
134threaded nature of bmcweb, they are not required to be explicitly threadsafe,
135but they must be always left in a valid state, and checked for other uses
136before occupying.
137
138```C++
139std::optional<std::string> currentOperation;
140void firstCallbackInFlow(){
141    currentOperation = "Foo";
142}
143void secondCallbackInFlow(){
144    currentOperation.reset();
145}
146```
147
148In the above case, the first callback needs a check to ensure that
149currentOperation is not already being used.
150
151### 8. Wildcard reference captures in lambdas
152```
153std::string x; auto mylambda = [&](){
154    x = "foo";
155}
156do_async_read(mylambda)
157```
158
159Numerous times, lifetime issues of const references have been injected into
160async bmcweb code.  While capturing by reference can be useful, given how
161difficult these types of bugs are to triage, bmcweb explicitly requires that all
162code captures variables by name explicitly, and calls out each variable being
163captured by value or by reference.  The above prototypes would change to
164[&x]()... Which makes clear that x is captured, and its lifetime needs tracked.
165
166
167### 9. URLs should end in "/"
168```C++
169BMCWEB("/foo/bar");
170```
171Unless you explicitly have a reason not to (as there is one known exception
172where the behavior must differ) all URL handlers should end in "/".  The bmcweb
173route handler will detect routes ending in slash and generate routes for both
174the route ending in slash and the one without.  This allows both URLs to be
175used by users.  While many specifications do not require this, it resolves a
176whole class of bug that we've seen in the past.
177
178
179### 10. URLs constructed in aggregate
180```C++
181std::string routeStart = "/redfish/v1";
182
183BMCWEB_ROUTE(routestart + "/SessionService/")
184```
185Very commonly, bmcweb maintainers and contributors alike have to do audits of
186all routes that are available, to verify things like security and documentation
187accuracy.  While these processes are largely manual, they can mostly be
188conducted by a simple grep statement to search for urls in question.  Doing the
189above makes the route handlers no longer greppable, and complicates bmcweb
190patchsets as a whole.
191
192### 11. Not responding to 404
193```C++
194BMCWEB_ROUTE("/myendpoint/<str>",
195    [](Request& req, Response& res, const std::string& id){
196     crow::connections::systemBus->async_method_call(
197          [asyncResp](const boost::system::error_code ec,
198                      const std::string& myProperty) {
199              if (ec)
200              {
201                  messages::internalError(asyncResp->res);
202                  return;
203              }
204              ... handle code
205          },
206          "xyz.openbmc_project.Logging",
207          "/xyz/openbmc_project/mypath/" + id,
208          "xyz.MyInterface", "GetAll", "");
209});
210```
211All bmcweb routes should handle 404 (not found) properly, and return it where
212appropriate.  500 internal error is not a substitute for this, and should be
213only used if there isn't a more appropriate error code that can be returned.
214This is important, because a number of vulnerability scanners attempt injection
215attacks in the form of /myendpoint/foobar, or /myendpoint/#$*(%)&#%$)(*&  in an
216attempt to circumvent security.  If the server returns 500 to any of these
217requests, the security scanner logs it as an error for followup.  While in
218general these errors are benign, and not actually a real security threat, having
219a clean security run allows maintainers to minimize the amount of time spent
220triaging issues reported from these scanning tools.
221
222An implementation of the above that handles 404 would look like:
223```C++
224BMCWEB_ROUTE("/myendpoint/<str>",
225    [](Request& req, Response& res, const std::string& id){
226     crow::connections::systemBus->async_method_call(
227          [asyncResp](const boost::system::error_code ec,
228                      const std::string& myProperty) {
229              if (ec == <error code that gets returned by not found>){
230                  messages::resourceNotFound(res);
231                  return;
232              }
233              if (ec)
234              {
235                  messages::internalError(asyncResp->res);
236                  return;
237              }
238              ... handle code
239          },
240          "xyz.openbmc_project.Logging",
241          "/xyz/openbmc_project/mypath/" + id,
242          "xyz.MyInterface", "GetAll", "");
243});
244```
245
246Note: A more general form of this rule is that no handler should ever return 500
247on a working system, and any cases where 500 is found, can immediately be
248assumed to be [a bug in either the system, or bmcweb.](https://github.com/openbmc/bmcweb/blob/master/DEVELOPING.md#error-handling)
249
250### 12. Imprecise matching
251```C++
252void isInventoryPath(const std::string& path){
253    if (path.find("inventory")){
254        return true;
255    }
256    return false;
257}
258```
259When matching dbus paths, HTTP fields, interface names, care should be taken to
260avoid doing direct string containment matching.  Doing so can lead to errors
261where fan1 and fan11 both report to the same object, and cause behavior breaks
262in subtle ways.
263
264When using dbus paths, rely on the methods on sdbusplus::message::object\_path.
265When parsing HTTP field and lists, use the RFC7230 implementations from
266boost::beast.
267
268Other commonly misused methods are:
269boost::iequals.  Unless the standard you're implementing (as is the case in some
270HTTP fields) requires case insensitive comparisons, casing should be obeyed,
271especially when relying on user-driven data.
272
273- boost::starts\_with
274- boost::ends\_with
275- std::string::starts\_with
276- std::string::ends\_with
277- std::string::rfind
278
279The above methods tend to be misused to accept user data and parse various
280fields from it.  In practice, there tends to be better, purpose built methods
281for removing just the field you need.
282
283### 13. Complete replacement of the response object
284
285```
286void getMembers(crow::Response& res){
287  res.jsonValue = {{"Value", 2}};
288}
289```
290
291In many cases, bmcweb is doing multiple async actions in parallel, and
292orthogonal keys within the Response object might be filled in from another
293task.  Completely replacing the json object can lead to convoluted situations
294where the output of the response is dependent on the _order_ of the asynchronous
295actions completing, which cannot be guaranteed, and has many time caused bugs.
296
297```
298void getMembers(crow::Response& res){
299  res.jsonValue["Value"] = 2;
300}
301```
302
303As an added benefit, this code is also more efficient, as it avoids the
304intermediate object construction and the move, and as a result, produces smaller
305binaries.
306
307Note, another form of this error involves calling nlohmann::json::reset(), to
308clear an object that's already been filled in.  This has the potential to clear
309correct data that was already filled in from other sources.
310
311