xref: /openbmc/bmcweb/docs/COMMON_ERRORS.md (revision 76c2ad64440e2864a0cc3d66a3b8bb0f5ac2f288)
1# Commonly recurring errors in bmcweb
2
3What follows is a list of common errors that new users to bmcweb tend to make
4when operating within its bounds for the first time. If this is your first time
5developing in bmcweb, the maintainers highly recommend reading and understanding
6_all_ of common traps before continuing with any development. Every single one
7of the examples below compile without warnings, but are incorrect in
8not-always-obvious ways, or impose a pattern that tends to cause hard to find
9bugs, or bugs that appear later. Every one has been submitted to code review
10multiple times.
11
12## 1. Directly dereferencing a pointer without checking for validity first
13
14```cpp
15int myBadMethod(const nlohmann::json& j){
16    const int* myPtr = j.get_if<int>();
17    return *myPtr;
18}
19```
20
21This pointer is not guaranteed to be filled, and could be a null dereference.
22
23## 2. String views aren't null terminated
24
25```cpp
26int getIntFromString(std::string_view s){
27    return std::atoi(s.data());
28}
29```
30
31This will give the right answer much of the time, but has the possibility to
32fail when `string_view` is not null terminated. Use `from_chars` instead, which
33takes both a pointer and a length
34
35## 3. Not handling input errors
36
37```cpp
38int getIntFromString(const std::string& s){
39    return std::atoi(s.c_str());
40}
41```
42
43In the case where the string is not representable as an int, this will trigger
44undefined behavior at system level. Code needs to check for validity of the
45string, ideally with something like `from_chars`, and return the appropriate
46error code.
47
48## 4. Walking off the end of a string
49
50```cpp
51std::string getFilenameFromPath(const std::string& path){
52    size_t index = path.find("/");
53    if (index != std::string::npos){
54        // If the string ends with "/", this will walk off the end of the string.
55        return path.substr(pos + 1);
56    }
57    return "";
58}
59```
60
61## 5. Using methods that throw (or not handling bad inputs)
62
63```cpp
64int myBadMethod(nlohmann::json& j){
65    return j.get<int>();
66}
67```
68
69This method throws, and bad inputs will not be handled
70
71Commonly used methods that fall into this pattern:
72
73- std::variant::get
74- std::vector::at
75- std::map::at
76- std::set::at
77- std::\<generic container type\>::at
78- nlohmann::json::operator!=
79- nlohmann::json::operator+=
80- nlohmann::json::at
81- nlohmann::json::get
82- nlohmann::json::get_ref
83- nlohmann::json::get_to
84- nlohmann::json::items
85- nlohmann::json::operator\<\<
86- nlohmann::json::operator\>\>
87- nlohmann::json::begin
88- nlohmann::json::operator[]
89- std::filesystem::create_directory
90- std::filesystem::rename
91- std::filesystem::file_size
92- std::stoi
93- std::stol
94- std::stoll
95
96### Special note: JSON
97
98`nlohmann::json::parse` by default
99[throws](https://json.nlohmann.me/api/basic_json/parse/) on failure, but also
100accepts an optional argument that causes it to not throw: set the 3rd argument
101to `false`.
102
103`nlohmann::json::dump` by default
104[throws](https://json.nlohmann.me/api/basic_json/dump/) on failure, but also
105accepts an optional argument that causes it to not throw: set the 4th argument
106to `replace`. Although `ignore` preserves content 1:1, `replace` is preferred
107from a security point of view.
108
109The nlohmann::json class represents ANY json object, but unfortunately includes
110a number of helper functions that allow it to "act" like it only represents an
111json dict. In the majority of uses, these will not cause a problem, but if a
112type that isn't an object (0.0, "foo", or null) is already present in the
113object, these methods will
114[throw an exception](https://json.nlohmann.me/api/basic_json/operator%5B%5D/#exceptions),
115which generally goes uncaught, and can lead to potential for denial of service
116on bad input. Care should be taken when operating with a nlohmann::json object,
117and should generally prefer to use an nlohmann::json::object_t or
118nlohmann::json::array_t where appropriate. There is quite a large amount of code
119in bmcweb that does not handle the distinction between nlohmann::json and
120nlohmann::json::object_t correctly. These will be corrected over time.
121
122### Special note: Boost
123
124there is a whole class of boost asio functions that provide both a method that
125throws on failure, and a method that accepts and returns an error code. This is
126not a complete list, but users should verify in the boost docs when calling into
127asio methods, and prefer the one that returns an error code instead of throwing.
128
129- boost::asio::ip::tcp::acceptor::bind();
130- boost::asio::ip::tcp::acceptor::cancel();
131- boost::asio::ip::tcp::acceptor::close();
132- boost::asio::ip::tcp::acceptor::listen();
133- boost::asio::ip::address::make_address();
134
135## 6. Blocking functions
136
137bmcweb uses a single reactor for all operations. Blocking that reactor for any
138amount of time causes all other operations to stop. The common blocking
139functions that tend to be called incorrectly are:
140
141- sleep()
142- boost::asio::ip::tcp::socket::read()
143- boost::asio::ip::tcp::socket::read_some()
144- boost::asio::ip::tcp::socket::write()
145- boost::asio::ip::tcp::socket::write_some()
146- boost::asio::ip::tcp::socket::connect()
147- boost::asio::ip::tcp::socket::send()
148- boost::asio::ip::tcp::socket::wait()
149- boost::asio::steady_timer::wait()
150
151Note: an exception is made for filesystem/disk IO read and write. This is mostly
152due to not having great abstractions for it that mate well with the async
153system, the fact that most filesystem accesses are into tmpfs (and therefore
154should be "fast" most of the time) and in general how little the filesystem is
155used in practice.
156
157## 7. Lack of locking between subsequent calls
158
159While global data structures are discouraged, they are sometimes required to
160store temporary state for operations that require it. Given the single threaded
161nature of bmcweb, they are not required to be explicitly threadsafe, but they
162must be always left in a valid state, and checked for other uses before
163occupying.
164
165```cpp
166std::optional<std::string> currentOperation;
167void firstCallbackInFlow(){
168    currentOperation = "Foo";
169}
170void secondCallbackInFlow(){
171    currentOperation.reset();
172}
173```
174
175In the above case, the first callback needs a check to ensure that
176currentOperation is not already being used.
177
178## 8. Wildcard reference captures in lambdas
179
180```cpp
181std::string x; auto mylambda = [&](){
182    x = "foo";
183}
184do_async_read(mylambda)
185```
186
187Numerous times, lifetime issues of const references have been injected into
188async bmcweb code. While capturing by reference can be useful, given how
189difficult these types of bugs are to triage, bmcweb explicitly requires that all
190code captures variables by name explicitly, and calls out each variable being
191captured by value or by reference. The above prototypes would change to
192`[&x]()...` Which makes clear that x is captured, and its lifetime needs
193tracked.
194
195## 9. URLs should end in "/"
196
197```cpp
198BMCWEB("/foo/bar");
199```
200
201Unless you explicitly have a reason not to (as there is one known exception
202where the behavior must differ) all URL handlers should end in "/". The bmcweb
203route handler will detect routes ending in slash and generate routes for both
204the route ending in slash and the one without. This allows both URLs to be used
205by users. While many specifications do not require this, it resolves a whole
206class of bug that we've seen in the past.
207
208Note, unit tests can now find this for redfish routes.
209
210## 10. URLs constructed in aggregate
211
212```cpp
213std::string routeStart = "/redfish/v1";
214
215BMCWEB_ROUTE(routestart + "/SessionService/")
216```
217
218Very commonly, bmcweb maintainers and contributors alike have to do audits of
219all routes that are available, to verify things like security and documentation
220accuracy. While these processes are largely manual, they can mostly be conducted
221by a simple grep statement to search for urls in question. Doing the above makes
222the route handlers no longer greppable, and complicates bmcweb patchsets as a
223whole.
224
225## 11. Not responding to 404
226
227```cpp
228BMCWEB_ROUTE("/myendpoint/<str>",
229    [](Request& req, Response& res, const std::string& id){
230     dbus::utility::async_method_call(
231          [asyncResp](const boost::system::error_code& ec,
232                      const std::string& myProperty) {
233              if (ec)
234              {
235                  messages::internalError(asyncResp->res);
236                  return;
237              }
238              ... handle code
239          },
240          "xyz.openbmc_project.Logging",
241          "/xyz/openbmc_project/mypath/" + id,
242          "xyz.MyInterface", "GetAll", "");
243});
244```
245
246All bmcweb routes should handle 404 (not found) properly, and return it where
247appropriate. 500 internal error is not a substitute for this, and should be only
248used if there isn't a more appropriate error code that can be returned. This is
249important, because a number of vulnerability scanners attempt injection attacks
250in the form of `/myendpoint/foobar`, or `/myendpoint/#$*(%)&#%$)(\*&` in an
251attempt to circumvent security. If the server returns 500 to any of these
252requests, the security scanner logs it as an error for followup. While in
253general these errors are benign, and not actually a real security threat, having
254a clean security run allows maintainers to minimize the amount of time spent
255triaging issues reported from these scanning tools.
256
257An implementation of the above that handles 404 would look like:
258
259```cpp
260BMCWEB_ROUTE("/myendpoint/<str>",
261    [](Request& req, Response& res, const std::string& id){
262     dbus::utility::async_method_call(
263          [asyncResp](const boost::system::error_code& ec,
264                      const std::string& myProperty) {
265              if (ec == <error code that gets returned by not found>){
266                  messages::resourceNotFound(res);
267                  return;
268              }
269              if (ec)
270              {
271                  messages::internalError(asyncResp->res);
272                  return;
273              }
274              ... handle code
275          },
276          "xyz.openbmc_project.Logging",
277          "/xyz/openbmc_project/mypath/" + id,
278          "xyz.MyInterface", "GetAll", "");
279});
280```
281
282Note: A more general form of this rule is that no handler should ever return 500
283on a working system, and any cases where 500 is found, can immediately be
284assumed to be
285[a bug in either the system, or bmcweb.](https://github.com/openbmc/bmcweb/blob/master/DEVELOPING.md#error-handling)
286
287## 12. Imprecise matching
288
289```cpp
290void isInventoryPath(const std::string& path){
291    if (path.find("inventory")){
292        return true;
293    }
294    return false;
295}
296```
297
298When matching dbus paths, HTTP fields, interface names, care should be taken to
299avoid doing direct string containment matching. Doing so can lead to errors
300where fan1 and fan11 both report to the same object, and cause behavior breaks
301in subtle ways.
302
303When using dbus paths, rely on the methods on `sdbusplus::message::object_path`.
304When parsing HTTP field and lists, use the RFC7230 implementations from
305boost::beast.
306
307Other commonly misused methods are: boost::iequals. Unless the standard you're
308implementing (as is the case in some HTTP fields) requires case insensitive
309comparisons, casing should be obeyed, especially when relying on user-driven
310data.
311
312- boost::starts_with
313- boost::ends_with
314- std::string::starts_with
315- std::string::ends_with
316- std::string::rfind
317
318The above methods tend to be misused to accept user data and parse various
319fields from it. In practice, there tends to be better, purpose built methods for
320removing just the field you need.
321
322## 13. Complete replacement of the response object
323
324```cpp
325void getMembers(crow::Response& res){
326  res.jsonValue = {{"Value", 2}};
327}
328```
329
330In many cases, bmcweb is doing multiple async actions in parallel, and
331orthogonal keys within the Response object might be filled in from another task.
332Completely replacing the json object can lead to convoluted situations where the
333output of the response is dependent on the _order_ of the asynchronous actions
334completing, which cannot be guaranteed, and has many time caused bugs.
335
336```cpp
337void getMembers(crow::Response& res){
338  res.jsonValue["Value"] = 2;
339}
340```
341
342As an added benefit, this code is also more efficient, as it avoids the
343intermediate object construction and the move, and as a result, produces smaller
344binaries.
345
346Note, another form of this error involves calling nlohmann::json::reset(), to
347clear an object that's already been filled in. This has the potential to clear
348correct data that was already filled in from other sources.
349
350## 14. Very long lambda callbacks
351
352```cpp
353dbus::utility::getSubTree("/", interfaces,
354                         [asyncResp](boost::system::error_code& ec,
355                                     MapperGetSubTreeResult& res){
356                            <many lines of code>
357                         })
358```
359
360Inline lambdas, while useful in some contexts, are difficult to read, and have
361inconsistent formatting with tools like clang-format, which causes significant
362problems in review, and in applying patchsets that might have minor conflicts.
363In addition, because they are declared in a function scope, they are difficult
364to unit test, and produce log messages that are difficult to read given their
365unnamed nature.
366
367Prefer to either use std::bind_front, and a normal method to handle the return,
368or a lambda that is less than 10 lines of code to handle an error inline. In
369cases where std::bind_front cannot be used, such as in
370sdbusplus::asio::connection::async_method_call, keep the lambda length less than
37110 lines, and call the appropriate function for handling non-trivial transforms.
372
373```cpp
374void afterGetSubTree(std::shared_ptr<bmcweb::AsyncResp>& asyncResp,
375                     boost::system::error_code& ec,
376                     MapperGetSubTreeResult& res){
377   <many lines of code>
378}
379
380dbus::utility::getSubTree("/xyz/openbmc_project/inventory", interfaces,
381                         std::bind_front(afterGetSubTree, asyncResp));
382```
383
384See also the
385[Cpp Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f11-use-an-unnamed-lambda-if-you-need-a-simple-function-object-in-one-place-only)
386for generalized guidelines on when lambdas are appropriate. The above
387recommendation is aligned with the C++ Core Guidelines.
388
389## 15. Using async_method_call where there are existing helper methods
390
391```cpp
392dbus::utility::async_method_call(
393    respHandler, "xyz.openbmc_project.ObjectMapper",
394    "/xyz/openbmc_project/object_mapper",
395    "xyz.openbmc_project.ObjectMapper", "GetSubTreePaths",
396    "/xyz/openbmc_project/inventory", 0, interfaces);
397```
398
399It's required to use D-Bus utility functions provided in the file
400include/dbus_utility.hpp instead of invoking them directly. Using the existing
401util functions will help to reduce the compilation time, increase code reuse and
402uniformity in error handling. Below are the list of existing D-Bus utility
403functions
404
405- getProperty
406- getAllProperties
407- checkDbusPathExists
408- getSubTree
409- getSubTreePaths
410- getAssociatedSubTree
411- getAssociatedSubTreePaths
412- getAssociatedSubTreeById
413- getAssociatedSubTreePathsById
414- getDbusObject
415- getAssociationEndPoints
416- getManagedObjects
417
418## 16. Using strings for DMTF schema Enums
419
420```cpp
421sensorJson["ReadingType"] = "Frequency";
422
423```
424
425Redfish Schema Enums and types are auto generated using
426scripts/generate_schema_enums.py. The generated header files contain the redfish
427enumerations which must be used for JSON response population.
428
429```cpp
430#include "generated/enums/sensor.hpp"
431sensorJson["ReadingType"] = sensor::ReadingType::Frequency;
432```
433
434## 17. Duplicated map lookups
435
436```cpp
437std::unordered_map<std::string, std::string> mymap;
438
439if (mymap.contains("mykey")){
440    std::string& myvalue = mymap["mykey"];
441}
442```
443
444While functionally correct doing the above results in more code generated, and
445duplicates the key lookup in the map. The above code also makes it more
446difficult to use const for "mymap", as operator[] is not const. Sometimes at()
447is used in place of operator[] to get around this, but also duplicates the
448lookup.
449
450As a minor consequence, the lookup key "mykey" is now duplicated, or needs to be
451loaded into a scope variable.
452
453```cpp
454std::unordered_map<std::string, std::string> mymap;
455
456auto it = mymap.find("mykey");
457if (it != mymap.end()){
458    std::string& myvalue = it->second;
459}
460```
461
462## 18. Auto lvalue for nontrivial function returns
463
464```cpp
465auto x = my_function();
466```
467
468Given this line of code, what is the type of x? Does x need to be checked for
469null before using? Does x hold a value that is making a copy?
470
471Explicitly declare the type when the function is nontrivial.
472
473```cpp
474int x = my_function();
475```
476
477Note, that exceptions to this rule are present when using templated data
478structures holding a member (map/vector/set). Given that the type is known, and
479generic data structures are well understood by both reviewers and authors,
480explicitly naming the iterator impacts both readability and can impact
481correctness in some const cases. Given that, consider the following code.
482
483```cpp
484std::map<int, int> mymap;
485
486auto it = mymap.find(0);
487```
488
489Given that the returned type is still present in the code, and the majority of
490C++ data structures follow a pattern that can be traced from the snippet above,
491auto is an improvement in readability here.
492