My Web Markups - xiaotian zhu
another distribution q(w|D) with some variational parameters θ.
P(w|D) (our posterior from above)
P(x) is the evidence
P(x|θ) which is the likelihood
P(θ) which is our prior
take a variational approximation rather than a Monte Carlo scheme to find the approximate Bayesian posterior distribution
only point estimates of the weights are achieved in the network. As a result, these networks make overconfident predictions and do not account for uncertainty in the parameters
Aleatoric uncertainty, on the other hand, is modelled by placing a distribution over the output of the model.
Epistemic uncertainty is modelled by placing a prior distribution over a model’s weights and then trying to capture how much these weights vary given some data
heteroscedastic uncertainty which depends on the inputs to the model
homoscedastic uncertainty, the uncertainty which stays constant for different inputs
Epistemic uncertainty represents the uncertainty caused by the model itself
Aleatoric uncertainty measures the noise inherent in the observations
A prior probability distribution is defined before observing the data; learning happens and the distribution transforms into a posterior distribution once the data is observed
Bayesian Neural Network Series Post 2: Background Knowledge | by Kumar Shridhar | NeuralSpace | Medium
15 annotations
medium.com
601
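The highlights above list the ingredients of Bayes' rule and the idea of replacing Monte Carlo sampling with a variational approximation q with parameters θ. As a compact reference, this is the standard form of that setup (generic notation consistent with the highlights; the ELBO line is the usual variational-inference objective, not a formula copied from the post):

```latex
% Bayes' rule for the weights, and the variational approximation
P(w \mid D) = \frac{P(D \mid w)\,P(w)}{P(D)},
\qquad
q_\theta(w) \approx P(w \mid D)

% Fitting q by minimizing the KL divergence to the posterior is equivalent
% to maximizing the evidence lower bound (ELBO):
\theta^{*} = \arg\min_\theta \,\mathrm{KL}\!\left[q_\theta(w)\,\|\,P(w \mid D)\right]
= \arg\max_\theta \;\mathbb{E}_{q_\theta(w)}\!\left[\log P(D \mid w)\right]
- \mathrm{KL}\!\left[q_\theta(w)\,\|\,P(w)\right]
```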
kumar-shridhar/Master-Thesis-BayesianCNN: Master Thesis on Bayesian Convolutional Neural Network using Variational Inference
github.com
707
by finding the parameter that maximizes the log-likelihood of the observed sample. This is the same as maximizing the likelihood function
the log-likelihood function is the natural logarithm of the likelihood function
viewed as a function of the parameter for the fixed sample we have observed, it is called the likelihood (or likelihood function)
the distribution of the sample belongs to a parametric family: there is a set of real vectors (called the parameter space) whose elements (called parameters) are put into correspondence with the distributions that could have generated the sample;
a sample, i.e., the realization of a random vector
the likelihood is a function that associates to each parameter the probability (or probability density) of observing the given sample
Log-likelihood
8 annotations
www.statlect.com
460
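To tie the fragments above into one definition: for a sample $x$ whose distribution belongs to a parametric family indexed by $\theta$, the likelihood, log-likelihood, and maximum-likelihood estimator are (standard notation, not copied from the page):

```latex
L(\theta; x) = p(x; \theta), \qquad
\ell(\theta; x) = \ln L(\theta; x), \qquad
\hat{\theta} = \arg\max_{\theta \in \Theta} \ell(\theta; x)
```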
the number of ego_pose records in our loaded database is the same as the number of sample_data records. These two records exhibit a one-to-one correspondence
ego_pose contains information about the location (encoded in translation) and the orientation (encoded in rotation) of the ego vehicle body frame
the translation and the rotation parameters are given with respect to the ego vehicle body frame
Open Datasets: nuScenes Tutorial - Scale
3 annotations
scale.com
688
Each function then returns a TensorOptions object preconfigured with that axis, but allowing even further modification via the builder-style methods shown above
The result of the conversion, float_tensor, is a new tensor pointing to new memory, unrelated to the source source_tensor.
.to(torch::kFloat32)
when there is only a single axis we would like to change compared to its default value, we can pass only that value.
Tensor Creation API — PyTorch master documentation
4 annotations
pytorch.org
476
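A minimal sketch of the builder-style TensorOptions usage and the .to(torch::kFloat32) conversion described above (assumes the PyTorch C++ API; the tensor names are illustrative):

```cpp
#include <torch/torch.h>

int main() {
  // Builder-style options: each method returns a TensorOptions
  // preconfigured with that axis (dtype, device, ...).
  auto options = torch::TensorOptions()
                     .dtype(torch::kInt64)
                     .device(torch::kCPU);
  torch::Tensor source_tensor = torch::zeros({2, 3}, options);

  // .to() returns a NEW tensor backed by new memory; source_tensor is untouched.
  torch::Tensor float_tensor = source_tensor.to(torch::kFloat32);

  // When only a single axis differs from its default, that value can be passed alone.
  torch::Tensor t = torch::ones({2, 3}, torch::kFloat64);
}
```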
HKUST
In utils/quant_dorefa.py, Line 46,Why does it multiply max_w ? · Issue #10 · zzzxxxttt/pytorch_DoReFaNet
1 annotation
github.com
368
CAS, interested in object detection, remote sensing
UCAS
Stargazers · zzzxxxttt/pytorch_simple_CenterNet_45
2 annotations
github.com
576
skew coefficient
optical center
focal length
translation, t.
rotation, R
intrinsic parameters represent a projective transformation from the 3-D camera’s coordinates into the 2-D image coordinates
extrinsic parameters represent a rigid transformation from 3-D world coordinate system to the 3-D camera’s coordinate system
camera coordinates are mapped into the image plane using the intrinsics parameters
world points are transformed to camera coordinates using the extrinsics parameters
What Is Camera Calibration? - MATLAB & Simulink - MathWorks 中国
9 annotations
ww2.mathworks.cn
428
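The intrinsic/extrinsic split highlighted above is usually written as a single projection equation; one standard form (generic symbols, not MathWorks' exact notation) is:

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \,[\,R \mid t\,]
  \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix}
      f_x & \gamma & c_x \\
      0   & f_y    & c_y \\
      0   & 0      & 1
    \end{bmatrix}
```

Here $[\,R \mid t\,]$ holds the extrinsic parameters (a rigid transform from world to camera coordinates) and $K$ holds the intrinsics: focal lengths $f_x, f_y$, optical center $(c_x, c_y)$, and skew coefficient $\gamma$.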
Tensor Indexing API — PyTorch master documentation
pytorch.org
498
model parameters must be broadcast to the other GPUs at the beginning of the next iteration
gradients are reduced on the master GPU
loss value is scattered across the GPUs and each GPU runs the backward pass to compute gradients
network outputs are then gathered on the master GPU
scattering the sub mini-batches across the GPU network. Each GPU runs the forward pass on its sub mini-batch on a separate thread
data-parallel
the available parallelism on the GPU is fully utilized at batch size ~8
Distributed data parallel training using Pytorch on AWS – Telesens
7 annotations
www.telesens.co
692
SECOND
ranked based on the moderately difficult results
pedestrians and cyclists we require a 3D bounding box overlap of 50%
For cars we require a 3D bounding box overlap of 70%
objects in don't care areas do not count as false positives
The KITTI Vision Benchmark Suite
5 annotations
www.cvlibs.net
498
unpack the entire expression that contains the parameter pack, not just the parameter pack
c++ - How does folding over comma work? - Stack Overflow
1 annotation
stackoverflow.com
545
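A small C++17 sketch of that point: a fold over the comma operator repeats the whole pattern containing the pack once per argument (the function and its name are mine, not from the answer):

```cpp
#include <iostream>

// The pattern (std::cout << args << '\n') is expanded for every element of
// the pack, joined by the comma operator -- not just "args" by itself.
template <typename... Args>
void print_all(const Args&... args) {
  ((std::cout << args << '\n'), ...);
}

int main() {
  print_all(1, 2.5, "three");  // expands into three full stream expressions
}
```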
cuda - What does nvprof output: "No kernels were profiled" mean, and how to fix it - Stack Overflow
stackoverflow.com
686
removes the constraint that observed states only depend on the hidden state in the same event
the observed states only depend on hidden state of current event
regard every event at a particular time step as a Naive Bayes model
the value of a particular feature is independent of the value of any other feature given the class variable
From Naive Bayes to Linear-chain CRF | CN Yah
4 annotations
cnyah.com
592
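For reference, the linear-chain CRF that these notes build up to (from Naive Bayes through HMMs) is usually written with per-position feature functions; a standard formulation is:

```latex
p(\mathbf{y} \mid \mathbf{x})
= \frac{1}{Z(\mathbf{x})}
  \exp\!\left(
    \sum_{t=1}^{T} \sum_{j} \lambda_j \, f_j\!\left(y_{t-1}, y_t, \mathbf{x}, t\right)
  \right),
\qquad
Z(\mathbf{x}) = \sum_{\mathbf{y}'} \exp\!\left(
    \sum_{t=1}^{T} \sum_{j} \lambda_j \, f_j\!\left(y'_{t-1}, y'_t, \mathbf{x}, t\right)
  \right)
```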
Halmstad-University/SalsaNext: Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving
github.com
546
traveller59/spconv: Spatial Sparse Convolution in PyTorch
github.com
379
$\boldsymbol{G}$ is the matrix obtained by taking the element-wise exponential of the matrix $g$
$H(y_{t+1}|\boldsymbol{x})$ is the exponential of the scores that the encoding model $h(y_{t+1}|\boldsymbol{x})$ (an RNN, CNN, etc.) assigns to each label at position $t+1$
end point
sum of exponentials
the normalization factor computed up to time step $t$ is denoted $Z_t$
$g$ is really just a finite parameter matrix to be trained
we only need to score each individual label and each pair of adjacent labels, and then sum all of these scores to obtain the total score
dependencies occur only between adjacent positions
because this is a conditional distribution, the normalization factor depends on $\boldsymbol{x}$
A Concise Introduction to Conditional Random Fields (CRF), with a pure Keras implementation - 科学空间|Scientific Spaces
9 annotations
kexue.fm
635
call the operator handle with all of the arguments passed into the dispatching function.
overload name of the operator
name of the operator
dispatcher
typed operator handle
the operator that we are going to dispatch to
look up a typed operator handle from the dispatcher corresponding to the operator that we are going to dispatch to
single
The TORCH_LIBRARY_IMPL lets us register implementations for operators on a specific dispatch key
the simple way of registering it (def("myadd", myadd_cpu)) would register the kernel to run in all cases, even if the tensor is not a CPU tensor!
The dispatcher determines what the highest priority dispatch key is at the time you call an operator
first execute the Autograd kernel, and then we redispatch to the CPU or CUDA kernel depending on the device types of the passed in tensors.
Dispatcher in C++ — PyTorch Tutorials 1.6.0 documentation
12 annotations
pytorch.org
388
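A condensed sketch of the registration pattern those highlights describe, following the tutorial's myadd example (kernel bodies omitted; myadd_cpu, myadd_cuda, and myadd_autograd are the tutorial's function names):

```cpp
#include <torch/torch.h>
#include <torch/library.h>

torch::Tensor myadd_cpu(const torch::Tensor& self, const torch::Tensor& other);
torch::Tensor myadd_cuda(const torch::Tensor& self, const torch::Tensor& other);
torch::Tensor myadd_autograd(const torch::Tensor& self, const torch::Tensor& other);

// Declare the operator schema once.
TORCH_LIBRARY(myops, m) {
  m.def("myadd(Tensor self, Tensor other) -> Tensor");
}

// Register per-dispatch-key implementations; the simple def("myadd", myadd_cpu)
// form would run the same kernel for every backend, even non-CPU tensors.
TORCH_LIBRARY_IMPL(myops, CPU, m)      { m.impl("myadd", myadd_cpu); }
TORCH_LIBRARY_IMPL(myops, CUDA, m)     { m.impl("myadd", myadd_cuda); }
TORCH_LIBRARY_IMPL(myops, Autograd, m) { m.impl("myadd", myadd_autograd); }
```

Inside the dispatching function itself, the tutorial looks up a typed operator handle from the dispatcher (using the operator name "myops::myadd" and an empty overload name) and calls that handle with all of the arguments passed into the dispatching function.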
Parameter pack expansion: expands to comma-separated list of zero or more patterns. Pattern must include at least one parameter pack.
Parameter pack(since C++11) - cppreference.com
1 annotation
en.cppreference.com
460
it learns a representation that uses cues from other tasks, such as segmentation
our model can learn multi-task weightings and outperform separate models trained individually on each task.
use homoscedastic uncertainty to weight the losses in multi-task learning models
because it is tested on the same problem with the same sensor.
train a model with dropout. Then, at test time, rather than performing model averaging, we can stochastically sample from the network with different random dropout masks. The statistics of this distribution of outputs will reflect the model’s epistemic uncertainty
model distributions over models and their parameters
the uncertainty parameter will no longer be a model output, but a free parameter we optimise.
learned loss attenuation
model Heteroscedastic aleatoric uncertainty just by changing our loss functions
Bayesian deep learning models typically form uncertainty estimates by either placing distributions over model weights, or by learning a direct mapping to probabilistic outputs
when the model is unfamiliar with the footpath, and the corresponding increased epistemic uncertainty
aleatoric uncertainty captures object boundaries where labels are noisy
Task-dependent or Homoscedastic uncertainty is aleatoric uncertainty which is not dependent on the input data. It is not a model output, rather it is a quantity which stays constant for all input data and varies between different tasks
Data-dependent or Heteroscedastic uncertainty is aleatoric uncertainty which depends on the input data and is predicted as a model output
Aleatoric uncertainty captures our uncertainty with respect to information which our data cannot explain
model uncertainty
Epistemic uncertainty captures our ignorance about which model generated our collected data
Deep Learning Is Not Good Enough, We Need Bayesian Deep Learning for Safe AI - Home
17 annotations
alexgkendall.com
460
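The "learned loss attenuation" highlight corresponds to a regression loss of roughly this form (as in Kendall and Gal's formulation, where $\sigma(x_i)$ is the predicted observation noise rather than a fixed hyperparameter):

```latex
\mathcal{L}
= \frac{1}{N} \sum_{i=1}^{N}
  \frac{\left\lVert y_i - f(x_i) \right\rVert^2}{2\,\sigma(x_i)^2}
  + \frac{1}{2}\log \sigma(x_i)^2
```

The first term downweights noisy inputs; the log term keeps the model from predicting infinite noise for everything.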
By thread indexing we are getting a unique number for each thread and each block in a grid
CUDA Thread Indexing. When I was learning CUDA programming, I… | by Anuradha Karunarathna | Noteworthy - The Journal Blog
1 annotation
blog.usejournal.com
429
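A minimal kernel showing the "unique number for each thread" idea in the 1-D case (names and launch configuration are illustrative):

```cuda
__global__ void fill(float* out, int n) {
  // Unique global index: block offset plus thread offset within the block.
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < n) {
    out[idx] = static_cast<float>(idx);
  }
}

// launch: fill<<<(n + 255) / 256, 256>>>(d_out, n);
```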
CRFs are indeed basically the sequential version of logistic regression
each feature function fjfjf_j
how much we suspect that the current word should be labeled as an adjective
a function that takes in as input
Introduction to Conditional Random Fields
4 annotations
blog.echen.me
661
Rvalue references allow a function to branch at compile time
X& is now also called an lvalue reference
If X is any type, then X&& is called an rvalue reference to X
Page 3 of: C++ Rvalue References Explained
3 annotations
thbecker.net
392
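A small example of the "branch at compile time" point: overloading on X& versus X&& selects different code for lvalue and rvalue arguments (the type and function names are mine):

```cpp
#include <iostream>
#include <utility>

struct X {};

void foo(X&)  { std::cout << "lvalue reference overload\n"; }
void foo(X&&) { std::cout << "rvalue reference overload\n"; }

int main() {
  X x;
  foo(x);             // lvalue  -> X&
  foo(X{});           // rvalue  -> X&&
  foo(std::move(x));  // xvalue  -> X&&
}
```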
- Added self.use_ninja = False in the __init__ function of BuildExtension class in ...\site-packages\torch\utils\cpp_extension.py, to get more informative output about where the build errors are. Instead of hacking the code like that, there might be a way to pass some argument to disable ninja build; I didn't explore that.
- For "Warning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified":
  - In each Anaconda prompt, before "cl" is invoked, we need to run the appropriate Visual Studio Developer command file to set up the environment variables. Example: call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvars64.bat". The exact batch file to execute depends on the architecture for which we want to compile etc.
- I decided not to use develop.sh, because it leads to a "Python: unknown command" error due to bash not being aware of the Anaconda environment (??). The first line in develop.sh is for cleaning up -- which can be done manually. I chose to directly run the second line "python setup.py develop".
- Handling the <tr1/functional> file not found error:
  - "tr1" was a temporary namespace that was in use when "functional" was still not part of the C++ standard.
  - In "sparseconfig.h" under "sparseconvnet\SCN\Metadata\sparsehash\internal", made the following changes:
    - Changed "#define HASH_FUN_H <tr1/functional>" to "#define HASH_FUN_H <functional>"
    - Changed "#define HASH_NAMESPACE std::tr1" to "#define HASH_NAMESPACE std"
- In cpp_extension.py, line 1398, set num_workers = 1.
- On line 297, self.use_ninja = False.
- There are several occurrences of "or" (instead of "||") and "and" (instead of "&&") in Metadata.cpp, NetworkInNetwork.cpp etc. To get them to build with the Microsoft compiler, we need to pass the "/permissive-" option (Note: Though the '/Za' flag is supposed to fix them, it causes other errors). Updated setup.py to add that option.
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\Metadata\sparsehash\internal\hashtable-common.h(166): error C2760: syntax error: unexpected token 'typedef', expected ';'
  - In hashtable-common.h, changed the definition of SPARSEHASH_COMPILE_ASSERT(expr, msg) to static_assert(expr, "message").
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\CUDA/SparseToDense.cpp(29): error C2664: 'at::Tensor &at::Tensor::resize_(c10::IntArrayRef,c10::optional<c10::MemoryFormat>) const': cannot convert argument 1 from 'std::array<long,3>' to 'c10::IntArrayRef'
  - Changed "std::array<long, Dimension + 2> sz" to "std::array<int64_t, Dimension + 2> sz".
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\CUDA/BatchNormalization.cu(83): error: calling a __host__ function("pow<float, double, (int)0> ") from a __global__ function("BatchNormalization_f_test<float, (int)16, (int)64> ") is not allowed
  - Changed "pow(_saveInvStd / nActive + eps, -0.5)" to "pow(double(_saveInvStd / nActive + eps), -0.5)". Otherwise, the calling signature happens to be pow(float, double), which does not correspond to the signature of any variant of the "pow" function available on CUDA.
- Big one: After doing all of the above, I got the code to compile, but a mysterious link error appeared. The error said something like this:
  - sparseconvnet_cuda.obj : error LNK2001: unresolved external symbol "public: long * __cdecl at::Tensor::data_ptr<long>(void)const " (??$data_ptr@J@Tensor@at@@QEBAPEAJXZ)
  - This was too mysterious. A knowledgeable poster on a CUDA forum offered a clue. As that poster said, code meant to be cross-platform should not be using "long": it ends up being 32 bits wide on 64-bit Windows machines while being 64 bits wide on 64-bit Linux machines.
  - Replaced all occurrences of "long" with "int64_t" and the mysterious link error went away.
Does the sparseConvNet have windows version? · Issue #128 · facebookresearch/SparseConvNet
1 annotation
github.com
582
You can also check the instance with multiple types
Python isinstance() explained with examples [Guide]
1 annotation
pynative.com
530
when compiling a given target
The named <target> must have been created by a command such as add_executable() or add_library()
target_include_directories — CMake 3.18.2 Documentation
2 annotations
cmake.org
508
the entire GPU grid is split into as many blocks of 1 x 1024 threads as are required to fill our matrices with one thread per component
each CUDA block will have 1024 threads
If you wanted to dispatch over all types and not just floating point types (Float and Double), you can use AT_DISPATCH_ALL_TYPES.
AT_DISPATCH_FLOATING_TYPES macro
kernel launch (indicated by the <<<...>>>).
.cu
this file will also declare functions that are defined in CUDA (.cu) files
write a C++ file which defines the functions that will be called from Python
Custom C++ and CUDA Extensions — PyTorch Tutorials 1.6.0 documentation
8 annotations
pytorch.org
409
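Putting the highlighted pieces together: the C++ file defines the functions callable from Python and declares the ones implemented in the CUDA (.cu) file. A trimmed sketch (my_op is a placeholder name, not the tutorial's LLTM example):

```cpp
// my_op.cpp -- the C++ file that defines the functions called from Python.
#include <torch/extension.h>

// Declared here, defined in the CUDA (.cu) file.
torch::Tensor my_op_cuda(torch::Tensor input);

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("my_op", &my_op_cuda, "my_op (CUDA)");
}
```

On the .cu side, the AT_DISPATCH_FLOATING_TYPES macro wraps the <<<blocks, threads>>> kernel launch so the kernel is instantiated for Float and Double; AT_DISPATCH_ALL_TYPES would cover all types instead.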
A type generator is a template whose only purpose is to synthesize a new type or types based on its template argument(s)
A traits class provides a way of associating information with a compile-time entity
Generic Programming Techniques
2 annotations
www.boost.org
593
vscode settings - How do I hide certain files from the sidebar in Visual Studio Code? - Stack Overflow
stackoverflow.com
457
module name (example)
First steps — pybind11 2.5.dev1 documentation
1 annotation
pybind11.readthedocs.io
669
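The "module name (example)" highlight refers to the first argument of the PYBIND11_MODULE macro; the documentation's minimal module looks like this:

```cpp
#include <pybind11/pybind11.h>

int add(int i, int j) { return i + j; }

PYBIND11_MODULE(example, m) {           // "example" is the module name
  m.doc() = "pybind11 example plugin";  // optional module docstring
  m.def("add", &add, "A function which adds two numbers");
}
```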
as soon as we observe a time-inverted object, we can be certain that the normal version of that object must enter the time-inversion machine at some point in the future
What story does《信条》(Tenet) tell? - 知乎
1 annotation
www.zhihu.com
341
Docker
www.yolomax.com
475
a discriminative model models the conditional distribution directly
How can the Conditional Random Field (CRF) model be explained with simple, easy-to-understand examples? How does it differ from HMM? - 知乎
1 annotation
www.zhihu.com
568
the current dice label only depends on the previous one
dice is fair
current dice label
only a viable strategy when I’m equally likely to use either dice
for each roll
Conditional Random Field Tutorial in PyTorch 🔥 | by Freddy Boulton | Towards Data Science
5 annotations
towardsdatascience.com
564
The at::Tensor class in ATen is not differentiable by default. To add the differentiability of tensors the autograd API provides, you must use tensor factory functions from the torch:: namespace instead of the at:: namespace
PyTorch C++ API — PyTorch master documentation
1 annotation
pytorch.org
446
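A tiny illustration of that distinction, assuming the C++ frontend: the torch:: factory with requires_grad produces a tensor that participates in autograd.

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // Differentiable: created via the torch:: factory with requires_grad set.
  torch::Tensor a = torch::ones({2, 2}, torch::requires_grad());
  torch::Tensor b = (a * a).sum();
  b.backward();                      // populates a.grad()
  std::cout << a.grad() << std::endl;
}
```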
when the vector goes out of scope, no destructors are automatically called.
assign = "actually" copies data.
We created an array of objects stored by value. And no, removing * from the declaration doesn't automatically make a pointer variable a reference.
The new C++ 11 rvalue reference && and why you should start using it - CodeProject
4 annotations
www.codeproject.com
421
An rvalue is an expression that is not an lvalue
An lvalue is an expression that refers to a memory location and allows us to take the address of that memory location via the & operator
Rvalue references solve at least two problems: Implementing move semantics Perfect forwarding
Rvalue references are a feature of C++ that was added with the C++11 standard. What makes rvalue references a bit difficult to grasp is that when you first look at them, it is not clear what their purpose is or what problems they solve. Therefore, I will not jump right in and explain what rvalue references are. Instead, I will start with the problems that are to be solved and then show how rvalue references provide the solution. That way, the definition of rvalue references will appear plausible and natural to you.
C++ Rvalue References Explained
4 annotations
thbecker.net
411
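A short sketch of the first problem rvalue references solve, move semantics: a move constructor steals the source's buffer instead of copying it (the class is illustrative):

```cpp
#include <cstddef>
#include <utility>

struct Buffer {
  std::size_t size = 0;
  int* data = nullptr;

  explicit Buffer(std::size_t n) : size(n), data(new int[n]) {}
  ~Buffer() { delete[] data; }

  // Copy: allocate new storage and duplicate the contents.
  Buffer(const Buffer& other) : size(other.size), data(new int[other.size]) {
    for (std::size_t i = 0; i < size; ++i) data[i] = other.data[i];
  }

  // Move: take ownership of the rvalue's buffer and leave it empty.
  Buffer(Buffer&& other) noexcept : size(other.size), data(other.data) {
    other.size = 0;
    other.data = nullptr;
  }
};

int main() {
  Buffer a(1000);
  Buffer b(std::move(a));  // move constructor: no element-wise copy
}
```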
templates are just a big state machine.
if we call a templated method with two different template arguments, they will actually be compiled into two totally different calls
Template Metaprogramming: Compile time loops over class methods | by Niko Savas | Medium
2 annotations
medium.com
427
cudaMemcpy
called from the CPU, executed on the GPU
__global__
the return type of a __global__ function must be void
global memory
grid
shared memory
thread block
thread
registers
thread
local memory
an SM can hold multiple blocks at the same time, but they have to be executed sequentially
a kernel is actually executed by one grid, and a kernel can only execute on one GPU at a time
when a kernel is launched, the execution configuration <<<grid, block>>> must also be used to specify the grid dimensions and thread-block dimensions that the kernel uses
A Quick Primer on CUDA Programming - Madcola - 博客园
15 annotations
www.cnblogs.com
480
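A compact example that touches the highlighted pieces: a void __global__ kernel, the <<<grid, block>>> launch configuration, and cudaMemcpy between host and device (array size and block size are arbitrary):

```cuda
#include <cuda_runtime.h>

__global__ void add_one(float* x, int n) {   // called from the host, runs on the GPU
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] += 1.0f;
}

int main() {
  const int n = 1 << 20;
  float* h = new float[n]();
  float* d = nullptr;
  cudaMalloc(&d, n * sizeof(float));
  cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

  dim3 block(256);
  dim3 grid((n + block.x - 1) / block.x);
  add_one<<<grid, block>>>(d, n);            // grid and block dims given at launch

  cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d);
  delete[] h;
}
```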
because managed memory performs data transfers automatically, cudaDeviceSynchronize() is needed here to ensure that the device and the host are synchronized
use the cudaMallocManaged function to allocate managed memory
the block size should generally be set to a multiple of 32
a thread block can only be scheduled on a single SM; an SM can generally schedule multiple thread blocks
gridDim
blockDim
threadIdx
blockIdx
a thread block contains many threads
when a kernel executes on the device, it actually launches many threads
but it can also be combined with __device__, in which case the function is compiled for both the device and the host
the host does not wait for the kernel to finish before executing the next step
threadIdx
each thread
a kernel function is declared with the __global__ qualifier
<<<grid, block>>> is used to specify the number of threads the kernel will run with
device refers to the GPU and its memory
host refers to the CPU and its memory
A Minimal Introduction to CUDA Programming - 知乎
19 annotations
zhuanlan.zhihu.com
460
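The managed-memory points above fit together like this (a sketch using the same add-one kernel idea as the previous example):

```cuda
#include <cuda_runtime.h>

__global__ void add_one(float* x, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) x[i] += 1.0f;
}

int main() {
  const int n = 1 << 20;
  float* x = nullptr;
  cudaMallocManaged(&x, n * sizeof(float));  // managed memory, visible to host and device

  int block = 256;                           // a multiple of 32
  int grid = (n + block - 1) / block;
  add_one<<<grid, block>>>(x, n);            // the host does not wait for the kernel...

  cudaDeviceSynchronize();                   // ...so synchronize before reading x on the host
  cudaFree(x);
}
```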
most non-trivial code has a "sweet spot" in the 128-512 threads per block range
number of threads per block should be a round multiple of the warp size, which is 32
Each block: 16kb/48kb/96kb of shared memory
Each block: 8k/16k/32k/64k/32k/64k/32k/64k/32k/64k registers
Each block: maximum dimensions of [1024,1024,64]
Each block cannot have more than 1024 threads
performance - How do I choose grid and block dimensions for CUDA kernels? - Stack Overflow
13 annotations
stackoverflow.com
352
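The usual way to apply those constraints in practice: pick a block size that is a multiple of the warp size (32) somewhere in the 128-512 sweet spot, and derive the grid size from the problem size. A sketch (my_kernel is an assumed kernel defined elsewhere):

```cuda
__global__ void my_kernel(float* data, int n);  // assumed kernel, defined elsewhere

void launch(float* d_data, int n) {
  const int threads_per_block = 256;  // multiple of 32, inside the 128-512 sweet spot
  const int blocks = (n + threads_per_block - 1) / threads_per_block;  // ceil(n / threads)
  my_kernel<<<blocks, threads_per_block>>>(d_data, n);
}
```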
If you're designing something that requires a lot of shared memory, then more threads-per-block might be advantageous
dimensions - CUDA determining threads per block, blocks per grid - Stack Overflow
1 annotation
stackoverflow.com
449
allows Tensor objects to be zero-dimensional
Scalars can be implicitly constructed from C++ number types
cannot be resized
from_blob
32-bit (rather than 64-bit) integer indexing may lead to overflows but is quite a bit faster on CUDA
.packed_accessor64<float,2>()
.accessor<float,2>()
recommended to use accessors for CPU tensors and packed accessors for CUDA tensors
accessors are not compatible with CUDA tensors
Accessors are temporary views of a Tensor. They are only valid for the lifetime of the tensor that they view and hence should only be used locally in a function, like iterators.
Accessors then expose an API for accessing the Tensor elements efficiently.
ATen’s API is auto-generated from the same declarations PyTorch uses so the two APIs will track each other over time.
Tensor Basics — PyTorch master documentation
15 annotations
pytorch.org
407
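A short CPU-side example of the accessor API highlighted above (shape and dtype are arbitrary; per the docs, the accessor is a temporary view valid only while the tensor is alive, and packed accessors are the CUDA counterpart):

```cpp
#include <torch/torch.h>

int main() {
  torch::Tensor t = torch::rand({3, 4});
  auto acc = t.accessor<float, 2>();   // temporary view; CPU tensors only

  float total = 0.0f;
  for (int64_t i = 0; i < acc.size(0); ++i) {
    for (int64_t j = 0; j < acc.size(1); ++j) {
      total += acc[i][j];              // efficient element access, no per-call dispatch
    }
  }
}
```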
in the current function or directory scope
set — CMake 3.18.2 Documentation
1 annotation
cmake.org
498
named <target> must have been created by a command such as add_executable() or add_library()
target_link_libraries — CMake 3.18.2 Documentation
1 annotation
cmake.org
479
each filter i has its own rule book
Paper reading: Submanifold Sparse Convolutional Networks - 知乎
1 annotation
zhuanlan.zhihu.com
493
option
add_library
CMAKE_CXX_STANDARD 11
TutorialConfig.h.in
VERSION 1.0
Upper, lower, and mixed case commands are supported by CMake
CMake Tutorial — CMake 3.18.2 Documentation
6 annotations
cmake.org
407
All members of the exponential family have conjugate priors
if the posterior distributions p(θ | x) are in the same probability distribution family as the prior probability distribution p(θ), the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood function p(x | θ)
Conjugate prior - Wikipedia
2 annotations
en.wikipedia.org
654
they all have a normal distribution with mean $0$ and variance $\sigma^2$
the covariance matrix of the error vector $\varepsilon$ is diagonal, which implies that the entries of $\varepsilon$ are mutually independent
mean equal to $0$ and covariance matrix equal to $\sigma^2 I$
$\varepsilon_i$ is an unobservable error term
$y_i$ is the dependent variable
$\beta$ is the vector of regression coefficients
$x_i$ is a vector of regressors
Linear regression - Maximum likelihood estimation
7 annotations
www.statlect.com
480
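In the notation used above, the model and the resulting maximum-likelihood estimates are (standard results, stated here for context):

```latex
y_i = x_i^{\top}\beta + \varepsilon_i, \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 I)
\;\Longrightarrow\;
\hat{\beta}_{\mathrm{ML}} = (X^{\top}X)^{-1}X^{\top}y, \qquad
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - x_i^{\top}\hat{\beta}_{\mathrm{ML}}\right)^2
```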
in Bayesian probability theory, if the posterior distribution p(θ|x) is in the same probability distribution family as the prior probability distribution p(θ), then the prior and the posterior are called conjugate distributions, and the prior is called the conjugate prior of the likelihood function
Home - 知乎
1 annotation
www.zhihu.com
632