xref: /openbmc/bmcweb/COMMON_ERRORS.md (revision 08bbe119)
1# Commonly recurring errors in bmcweb
2
3What follows is a list of common errors that new users to bmcweb tend to make
4when operating within its bounds for the first time. If this is your first time
5developing in bmcweb, the maintainers highly recommend reading and understanding
6_all_ of common traps before continuing with any development. Every single one
7of the examples below compile without warnings, but are incorrect in
8not-always-obvious ways, or impose a pattern that tends to cause hard to find
9bugs, or bugs that appear later. Every one has been submitted to code review
10multiple times.
11
12## 1. Directly dereferencing a pointer without checking for validity first
13
14```cpp
15int myBadMethod(const nlohmann::json& j){
16    const int* myPtr = j.get_if<int>();
17    return *myPtr;
18}
19```
20
21This pointer is not guaranteed to be filled, and could be a null dereference.
22
23## 2. String views aren't null terminated
24
25```cpp
26int getIntFromString(std::string_view s){
27    return std::atoi(s.data());
28}
29```
30
31This will give the right answer much of the time, but has the possibility to
32fail when `string_view` is not null terminated. Use `from_chars` instead, which
33takes both a pointer and a length
34
35## 3. Not handling input errors
36
37```cpp
38int getIntFromString(const std::string& s){
39    return std::atoi(s.c_str());
40}
41```
42
43In the case where the string is not representable as an int, this will trigger
44undefined behavior at system level. Code needs to check for validity of the
45string, ideally with something like `from_chars`, and return the appropriate
46error code.
47
48## 4. Walking off the end of a string
49
50```cpp
51std::string getFilenameFromPath(const std::string& path){
52    size_t index = path.find("/");
53    if (index != std::string::npos){
54        // If the string ends with "/", this will walk off the end of the string.
55        return path.substr(pos + 1);
56    }
57    return "";
58}
59```
60
61## 5. Using methods that throw (or not handling bad inputs)
62
63```cpp
64int myBadMethod(nlohmann::json& j){
65    return j.get<int>();
66}
67```
68
69This method throws, and bad inputs will not be handled
70
71Commonly used methods that fall into this pattern:
72
73- std::variant::get
74- std::vector::at
75- std::map::at
76- std::set::at
77- std::\<generic container type\>::at
78- nlohmann::json::operator!=
79- nlohmann::json::operator+=
80- nlohmann::json::at
81- nlohmann::json::get
82- nlohmann::json::get_ref
83- nlohmann::json::get_to
84- nlohmann::json::operator\<\<
85- nlohmann::json::operator\>\>
86- std::filesystem::create_directory
87- std::filesystem::rename
88- std::filesystem::file_size
89- std::stoi
90- std::stol
91- std::stoll
92
93### Special note: JSON
94
95`nlohmann::json::parse` by default
96[throws](https://json.nlohmann.me/api/basic_json/parse/) on failure, but also
97accepts an optional argument that causes it to not throw: set the 3rd argument
98to `false`.
99
100`nlohmann::json::dump` by default
101[throws](https://json.nlohmann.me/api/basic_json/dump/) on failure, but also
102accepts an optional argument that causes it to not throw: set the 4th argument
103to `replace`. Although `ignore` preserves content 1:1, `replace` is preferred
104from a security point of view.
105
106### Special note: Boost
107
108there is a whole class of boost asio functions that provide both a method that
109throws on failure, and a method that accepts and returns an error code. This is
110not a complete list, but users should verify in the boost docs when calling into
111asio methods, and prefer the one that returns an error code instead of throwing.
112
113- boost::asio::ip::tcp::acceptor::bind();
114- boost::asio::ip::tcp::acceptor::cancel();
115- boost::asio::ip::tcp::acceptor::close();
116- boost::asio::ip::tcp::acceptor::listen();
117- boost::asio::ip::address::make_address();
118
119## 6. Blocking functions
120
121bmcweb uses a single reactor for all operations. Blocking that reactor for any
122amount of time causes all other operations to stop. The common blocking
123functions that tend to be called incorrectly are:
124
125- sleep()
126- boost::asio::ip::tcp::socket::read()
127- boost::asio::ip::tcp::socket::read_some()
128- boost::asio::ip::tcp::socket::write()
129- boost::asio::ip::tcp::socket::write_some()
130- boost::asio::ip::tcp::socket::connect()
131- boost::asio::ip::tcp::socket::send()
132- boost::asio::ip::tcp::socket::wait()
133- boost::asio::steady_timer::wait()
134
135Note: an exception is made for filesystem/disk IO read and write. This is mostly
136due to not having great abstractions for it that mate well with the async
137system, the fact that most filesystem accesses are into tmpfs (and therefore
138should be "fast" most of the time) and in general how little the filesystem is
139used in practice.
140
141## 7. Lack of locking between subsequent calls
142
143While global data structures are discouraged, they are sometimes required to
144store temporary state for operations that require it. Given the single threaded
145nature of bmcweb, they are not required to be explicitly threadsafe, but they
146must be always left in a valid state, and checked for other uses before
147occupying.
148
149```cpp
150std::optional<std::string> currentOperation;
151void firstCallbackInFlow(){
152    currentOperation = "Foo";
153}
154void secondCallbackInFlow(){
155    currentOperation.reset();
156}
157```
158
159In the above case, the first callback needs a check to ensure that
160currentOperation is not already being used.
161
162## 8. Wildcard reference captures in lambdas
163
164```cpp
165std::string x; auto mylambda = [&](){
166    x = "foo";
167}
168do_async_read(mylambda)
169```
170
171Numerous times, lifetime issues of const references have been injected into
172async bmcweb code. While capturing by reference can be useful, given how
173difficult these types of bugs are to triage, bmcweb explicitly requires that all
174code captures variables by name explicitly, and calls out each variable being
175captured by value or by reference. The above prototypes would change to
176`[&x]()...` Which makes clear that x is captured, and its lifetime needs
177tracked.
178
179## 9. URLs should end in "/"
180
181```cpp
182BMCWEB("/foo/bar");
183```
184
185Unless you explicitly have a reason not to (as there is one known exception
186where the behavior must differ) all URL handlers should end in "/". The bmcweb
187route handler will detect routes ending in slash and generate routes for both
188the route ending in slash and the one without. This allows both URLs to be used
189by users. While many specifications do not require this, it resolves a whole
190class of bug that we've seen in the past.
191
192## 10. URLs constructed in aggregate
193
194```cpp
195std::string routeStart = "/redfish/v1";
196
197BMCWEB_ROUTE(routestart + "/SessionService/")
198```
199
200Very commonly, bmcweb maintainers and contributors alike have to do audits of
201all routes that are available, to verify things like security and documentation
202accuracy. While these processes are largely manual, they can mostly be conducted
203by a simple grep statement to search for urls in question. Doing the above makes
204the route handlers no longer greppable, and complicates bmcweb patchsets as a
205whole.
206
207## 11. Not responding to 404
208
209```cpp
210BMCWEB_ROUTE("/myendpoint/<str>",
211    [](Request& req, Response& res, const std::string& id){
212     crow::connections::systemBus->async_method_call(
213          [asyncResp](const boost::system::error_code& ec,
214                      const std::string& myProperty) {
215              if (ec)
216              {
217                  messages::internalError(asyncResp->res);
218                  return;
219              }
220              ... handle code
221          },
222          "xyz.openbmc_project.Logging",
223          "/xyz/openbmc_project/mypath/" + id,
224          "xyz.MyInterface", "GetAll", "");
225});
226```
227
228All bmcweb routes should handle 404 (not found) properly, and return it where
229appropriate. 500 internal error is not a substitute for this, and should be only
230used if there isn't a more appropriate error code that can be returned. This is
231important, because a number of vulnerability scanners attempt injection attacks
232in the form of `/myendpoint/foobar`, or `/myendpoint/#$*(%)&#%$)(\*&` in an
233attempt to circumvent security. If the server returns 500 to any of these
234requests, the security scanner logs it as an error for followup. While in
235general these errors are benign, and not actually a real security threat, having
236a clean security run allows maintainers to minimize the amount of time spent
237triaging issues reported from these scanning tools.
238
239An implementation of the above that handles 404 would look like:
240
241```cpp
242BMCWEB_ROUTE("/myendpoint/<str>",
243    [](Request& req, Response& res, const std::string& id){
244     crow::connections::systemBus->async_method_call(
245          [asyncResp](const boost::system::error_code& ec,
246                      const std::string& myProperty) {
247              if (ec == <error code that gets returned by not found>){
248                  messages::resourceNotFound(res);
249                  return;
250              }
251              if (ec)
252              {
253                  messages::internalError(asyncResp->res);
254                  return;
255              }
256              ... handle code
257          },
258          "xyz.openbmc_project.Logging",
259          "/xyz/openbmc_project/mypath/" + id,
260          "xyz.MyInterface", "GetAll", "");
261});
262```
263
264Note: A more general form of this rule is that no handler should ever return 500
265on a working system, and any cases where 500 is found, can immediately be
266assumed to be
267[a bug in either the system, or bmcweb.](https://github.com/openbmc/bmcweb/blob/master/DEVELOPING.md#error-handling)
268
269## 12. Imprecise matching
270
271```cpp
272void isInventoryPath(const std::string& path){
273    if (path.find("inventory")){
274        return true;
275    }
276    return false;
277}
278```
279
280When matching dbus paths, HTTP fields, interface names, care should be taken to
281avoid doing direct string containment matching. Doing so can lead to errors
282where fan1 and fan11 both report to the same object, and cause behavior breaks
283in subtle ways.
284
285When using dbus paths, rely on the methods on `sdbusplus::message::object_path`.
286When parsing HTTP field and lists, use the RFC7230 implementations from
287boost::beast.
288
289Other commonly misused methods are: boost::iequals. Unless the standard you're
290implementing (as is the case in some HTTP fields) requires case insensitive
291comparisons, casing should be obeyed, especially when relying on user-driven
292data.
293
294- boost::starts_with
295- boost::ends_with
296- std::string::starts_with
297- std::string::ends_with
298- std::string::rfind
299
300The above methods tend to be misused to accept user data and parse various
301fields from it. In practice, there tends to be better, purpose built methods for
302removing just the field you need.
303
304## 13. Complete replacement of the response object
305
306```cpp
307void getMembers(crow::Response& res){
308  res.jsonValue = {{"Value", 2}};
309}
310```
311
312In many cases, bmcweb is doing multiple async actions in parallel, and
313orthogonal keys within the Response object might be filled in from another task.
314Completely replacing the json object can lead to convoluted situations where the
315output of the response is dependent on the _order_ of the asynchronous actions
316completing, which cannot be guaranteed, and has many time caused bugs.
317
318```cpp
319void getMembers(crow::Response& res){
320  res.jsonValue["Value"] = 2;
321}
322```
323
324As an added benefit, this code is also more efficient, as it avoids the
325intermediate object construction and the move, and as a result, produces smaller
326binaries.
327
328Note, another form of this error involves calling nlohmann::json::reset(), to
329clear an object that's already been filled in. This has the potential to clear
330correct data that was already filled in from other sources.
331